PREFACE


This volume is a revision and extension of a book by the same 
title published privately in lithoprinted form in 1935. The wide 
demand for the preliminary edition showed that there was a need 
for the type of presentation of statistics here offered. 

The characteristic feature of the book is the effort to explain 
the mathematical origins of the most widely used statistical 
formulas in terms that persons with comparatively little mathe- 
matical training can easily follow. We believe that, if statistical 
workers do not take their tools as magic but understand them in 
the light of their origins and assumptions, they will use these 
tools more intelligently and more safely. In order to make such 
understanding available to persons of little mathematical
training we give the derivations in much detail. It is a well-known
fact that the source of difficulty in mathematical reading by 
relatively untrained persons is largely the omission of steps which 
are supposed to be obvious. When these steps are supplied and 
when the use of specialized mathematical terminology is reduced 
to a minimum, much that would otherwise be closed to the reader 
is readily understandable. In order to make calculus available
as a tool for those who do not have a command of it, we open
this volume with a chapter on calculus. This is, of course, only
“a little calculus,” but it is enough to prepare the reader who has 
not hitherto studied calculus to follow the derivations in which 
we must draw upon this branch of mathematics. Our experience 
with this presentation, as well as that reported by some others, 
shows that this chapter on calculus can be mastered in about 
10 per cent of the time normally allotted to a one-semester course 
in advanced statistics. 

The title of the book is somewhat too pretentious. It might 
better be called Some Statistical Procedures and a Little Insight 
into the Mathematical Bases of a Few of Them. It is not, of course, 
a comprehensive treatment of the mathematical bases of
statistics. It is intended to bridge the gap between the elementary
courses, in which the formulas are given purely authoritatively,
and the original contributions in the monographic press, which
are often highly mathematical in character. We had hoped to 
be able to include in this volume a section on the geometry of 
hyperspace, matrix algebra, and other forms of advanced mathe- 
matics basic to the reading of present-day statistical literature. 
We had hoped to make this parallel in simplicity our chapter on 
calculus. But we found that, if this were to be made really 
intelligible without being superficial, we would need to allot to it 
an amount of additional space that would not be feasible with- 
out sacrificing the other and simpler functions which this volume 
is intended to perform. An introduction to the geometry of 
hyperspace, as well as to some other forms of higher mathematics, 
is really indispensable to anyone who wishes to follow contem- 
porary statistical theory. But it will probably need to be set up 
in a separate volume—a chatty, leisurely volume. 

In this edition we have included many of the statistical tech- 
niques advocated by R. A. Fisher and have undertaken to bring 
them into synthesis with classical statistics. We do not believe 
that the Fisher techniques will prove to have the importance 
for research in the psychological and social sciences that they 
have in the biological sciences, because in the former fields it is 
unnecessary to work much with small samples and with rough 
exploratory research. Certainly the Fisher techniques will 
only supplement and not supplant the classical methods in
the psychological and social sciences. Nevertheless, we believe 
that the workers for whom we are writing in these fields are 
entitled to know what these techniques are. We have attempted 
to take the magic out of them, as we did also out of classical 
statistics, by explaining them in very simple terms and by show- 
ing how they fit in with the older methods. In this way we 
hope to bring it about that the workers in the fields for which we 
are writing will find some useful elements in them without grasp- 
ing at them as some “new magic” and unwarrantedly throwing 
away the vastly useful techniques of classical statistics as 
“antiquated.” 

We are glad to acknowledge our very great indebtedness to 
the work of Prof. Truman L. Kelley. Indeed, when this book 
was first begun it was intended merely as a footnote to his Statis- 
tical Method. Even though frequent explicit references to this 
book are absent, the informed reader will see that our treatment 
was shaped largely by Kelley’s and often closely parallels his.
In the further development of the work we went, of course,
directly to the sources in the monographic literature, and hence 
we are obligated to Karl Pearson and the many other scholars 
who contributed to that literature, a very large portion of whom 
wrote under the inspiration and the guidance of Pearson. We 
desire to express our gratitude to members of the mathematics 
department of the Pennsylvania State College, especially to 
Clyde H. Graves, for competent counsel on technical matters 
given unstintedly every time we had occasion to seek such help. 

We are also indebted to Prof. R. A. Fisher and his publishers, 
Oliver and Boyd, for permission to use the table of the distribu- 
tion of t which we reproduce on page 173; to Prof. Egon S. Pear- 
son, editor of Biometrika and the Biometrika Publications, for 
permission to use the two tables on the normal curve integral, 
pages 481 to 487, and the chi-square table pages 498 to 500; and 
to Mrs. Marjory Gosset of Oxford, England—wife and heir of 
William Sealy Gosset, the brilliant English scholar who signed 
his statistical articles “Student”—for permission to use his tables
of t published in Metron in 1925, which tables we give on pages 
488 to 493 of this volume. 
STATISTICAL PROCEDURES
AND 
THEIR MATHEMATICAL BASES 


CHAPTER I 
A LITTLE CALCULUS 


This chapter is intended for persons who have not previously 
studied calculus. It presents, in a way that a reader who has had 
only a limited training in mathematics should be able to follow, 
practically all the calculus upon which we shall have occasion 
to draw in this volume on statistics, which includes many funda- 
mental elements common also to applications in other fields. We 
trust that this simple presentation of the elements of differential 
and integral calculus may not only prove useful to the student of 
statistics but that it may also give to laymen in mathematics an 
interesting and culturally enriching insight into the nature and 
applications of this fascinating mathematical discipline. 

In every case where one quantity varies in a manner that is 
definitely related to the variation in a second, the relation
between the two may be represented geometrically by a curve of
some shape (including a straight line). Take first the simple
relation y = x, where x may be represented by any number and y
will therefore of necessity be the same number. We may lay off
this relation on the adjacent diagram.

Fig. 1. Straight line relation.

When x = 1, y = 1; when x = 2, y = 2, etc. If we go to the right
one unit for x and then up one unit for y, we shall have a point
$x_1y_1$ that shows the relation between the two series at that
value of x. If we go two units to the right and two units up, we
shall have a second point $x_2y_2$; etc. A straight line may be
drawn through all these points. Its slope will be
$\frac{1}{1}, \frac{2}{2}, \frac{3}{3}, \ldots = 1$. Its slope will
always be the same at all values of x.
Suppose, now, that y = 0.4x. We shall then have a line
representing the relation as follows:

Fig. 2. Straight line relation.


Here likewise we can represent the relation by a straight line, 
and this line will have the same slope at every value of x; at each
point a change of n units in x will be accompanied by a change 
of 0.4n units in y. 

But let us now take a more complicated case, $y = x^2$.

     x = 0       y = 0
     x = ½       y = ¼
     x = 1       y = 1
     x = 1½      y = 2¼
     x = 2       y = 4
     x = 3       y = 9
     x = -½      y = ¼
     x = -1      y = 1
     x = -1½     y = 2¼
     x = -2      y = 4
     x = -3      y = 9
     x = -4      y = 16

Fig. 3. Curved line relation; slope changes with x.


Here the line is not a straight one; it does not have the same 
slope at all values of x. As we proceed out from the y axis, the
slope is at first very small; at x = 1 it is moderate; and at x = 3
the slope is very steep. We have a similar behavior on the side
where x has negative values. We have, in fact, very great
difficulty in saying what the slope is, because it is always
changing. We could draw a straight line between A and B, where
x = 1 and x = 2, respectively, but the slope of this line would
not precisely describe the slope of the curve. If we took a smaller
change in x, say that represented by the distance AB′ on our scale,
our secant line would more nearly coincide with the curve. We
may consider A as any fixed point on the graph and allow B to
move along the curve and approach A as a limiting position.
The changes in x would become smaller and smaller and approach
zero as a limit. The secant line drawn through A and B would
turn about A, approaching the tangent line at A as its limiting
position.
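
The approach of the secant to the tangent can be watched numerically. The following Python sketch is only an illustration (the function names are our own assumptions, not part of the text): it fixes A at x = 1 on the curve $y = x^2$ and lets B slide toward A, printing the slope of the secant each time. The slopes appear to settle toward a single value as the change in x shrinks.

```python
def square(x):
    """The curve under study, y = x**2."""
    return x * x


def secant_slope(f, x_a, x_b):
    """Slope of the straight line through (x_a, f(x_a)) and (x_b, f(x_b))."""
    return (f(x_b) - f(x_a)) / (x_b - x_a)


# A stays at x = 1; B starts at x = 2 and is moved ever closer to A.
steps = (1.0, 0.5, 0.1, 0.01, 0.001)
slopes = [secant_slope(square, 1.0, 1.0 + step) for step in steps]

for step, slope in zip(steps, slopes):
    print(f"change in x = {step:<6} secant slope = {slope:.4f}")
```

With B at x = 2 the secant slope is 3; by the time the change in x is 0.001 the slope is within about a thousandth of 2.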

Now the basic task of the differential calculus (except in 
regions of discontinuity and other similar matters which lie 
beyond the scope of this chapter) is to ascertain the slope of a 
curve at various points by determining the slope of a secant 
which, in a limiting position, becomes the tangent to the curve 
at the point in question. This same idea may be expressed in 
other terms by saying that it is the task of the differential 
calculus to ascertain the amount of change in a variable y that 
corresponds to a certain change in a related variable x as these
increments in the independent variable x become so small as to 
approach zero in value. At certain times in its history this 
discipline has been called the infinitesimal calculus in recognition 
of the fact that it deals with the relation of infinitesimal incre- 
ments of one variable to infinitesimal increments of another. 
In operating with the calculus we are often operating algebraically 
with no curve in sight; but usually we can represent these 
algebraic operations geometrically and show that what we are 
seeking is something about the slope of that curve at some
point.


DIFFERENTIATION 
Let us proceed with that algebraic process with which we said, 
in our preceding paragraph, we shall often be operating with 
no curve before us visually. We have the equation 
     $y = x^2$

We wish to find what change in y goes with a change in x at
any value of x in which we are interested. Let $\Delta x$ be an
increment to be added to x (algebraically), and $\Delta y$ be the
corresponding increment that would need to be added to y in order
to maintain the equation.

(1)  $y + \Delta y = (x + \Delta x)^2$

Performing the indicated square,

     $y + \Delta y = x^2 + 2x\,\Delta x + (\Delta x)^2$

But our original equation gave us $y = x^2$. We may subtract
the terms of this equation from the corresponding ones of our
last equation above on the basis of the axiom, “If equals be
subtracted from equals the remainders are equal.” We shall
then have

(2)  $\Delta y = 2x\,\Delta x + (\Delta x)^2$

Dividing through by $\Delta x$, we shall have

(3)  $\frac{\Delta y}{\Delta x} = 2x + \Delta x$


We said $\Delta x$ should be an increment added to x and $\Delta y$ an
increment added to y, but we did not commit ourselves as to
the particular size of the increment. Let us now conceive of
$\Delta x$ as decreasing until it becomes infinitesimal in size. It will
necessarily drag $\Delta y$ down with it, since the equation must
continue to hold for all values of $\Delta x$. When $\Delta x$ has become so
small as to have approached zero as its limit let us replace
$\Delta y/\Delta x$ by $dy/dx$. At this limit the $\Delta x$ in the last term of our
equation will approach zero in value and thus disappear from
consideration. The reason why $dy/dx$ cannot be similarly dropped
as of zero value is that both its numerator and its denominator
become small together so that the fraction has a value that may be
of considerable dimension. And so as the limit zero is approached
by $\Delta x$ we have

(4)  $\frac{dy}{dx} = 2x$

This $2x$ is called the derivative of the expression $y = x^2$. The
process of getting it is called differentiation. If the $dx$ appears
in the denominator of the fraction expressing the derivative, we
say that we are differentiating “with respect to x”; if the $dy$
appears in the denominator, as it will sometimes do in our later
developments, we say we are differentiating “with respect to y.”
This process alone, in more or less complicated forms, constitutes
essentially all there is to the differential calculus. In terms of
the slope of a curve a derivative equal to $2x$ means that, at the
point where x = 1, the slope of the curve is 2 times 1 or 2 (which
means that at that point the y values are changing twice as
rapidly as the x values through the infinitesimal distance to which
we have shrunken our $\Delta x$ at its limit). At the point where x = 3,
the slope of the curve relating the two is 2 times 3 or 6, which
means that y changes 6 units for each unit of change in x.

After a few more concrete examples we shall seek a general 
rule for differentiating an expression directly without going 
each time through a long process of algebraic manipulation. 
But in the meantime the reader may be interested in observing 
the relation of the form of the $2x$ to the $x^2$ of which it is the
derivative. He will notice that the exponent of the x has dropped from
2 to 1, a decrease of one unit. He will also observe that the 
coefficient of the derivative has become 2, possibly the same 2 
that was lost from the original exponent; about that we shall 
see later. 

Let us now try differentiating the expression $y = x^3$. We
shall go through the same four fundamental steps, through
which we passed in our previous example, as follows: (1) Add
$\Delta y$ to the y and $\Delta x$ to the x and perform the indicated involution.
(2) Subtract from the resultant equation our original equation.
(3) Divide through by $\Delta x$. (4) Let $\Delta x$ approach zero as a limit,
and, as the limit is approached, substitute $dy/dx$, the symbol
for the derivative at the limit, for $\Delta y/\Delta x$; drop from the equation
any of these $\Delta x$ values that stand without a $\Delta$ denominator, on
the ground that even in the first power their values are
approximately zero and that in any of the higher powers the values
are lower than in the first power.

     $y = x^3$

Adding $\Delta y$ to y and $\Delta x$ to x,

(1)  $y + \Delta y = (x + \Delta x)^3$

Expanding the second term,

     $y + \Delta y = x^3 + 3x^2\,\Delta x + 3x(\Delta x)^2 + (\Delta x)^3$

Subtracting,

(2)  $\Delta y = 3x^2\,\Delta x + 3x(\Delta x)^2 + (\Delta x)^3$

Dividing by $\Delta x$,

(3)  $\frac{\Delta y}{\Delta x} = 3x^2 + 3x\,\Delta x + (\Delta x)^2$

Letting $\Delta x$ approach zero,

(4)  $\frac{dy}{dx} = 3x^2$


In this last expression the $3x\,\Delta x$ dropped out in the limit
because as $\Delta x$ approaches zero as a limit, any product formed
by multiplying it by any factor approaches zero and, in the
limit, disappears from the equation. For a similar reason the
$(\Delta x)^2$ becomes zero in the limit. In fact since $\Delta x$ becomes, as it
decreases, a very small quantity (i.e., a decimal quantity less
than 1), when raised to any power (including 1) and multiplied
by 1 or by any other factor it will approach zero and vanish
from the equation as $\Delta x$ approaches zero as its limit.

The derivative of $y = x^3$ is, therefore, $3x^2$. Notice that here,
again, the original exponent has become the coefficient of the
derivative and that the exponent of the x in the derivative is
one less than that of the original quantity. Let us now take a
more generalized example,

     $y = x^n$

where n may represent any positive integer.¹ Performing in
succession our four fundamental steps,

(1)  $y + \Delta y = (x + \Delta x)^n$

Expanding,

     $y + \Delta y = x^n + nx^{n-1}\,\Delta x + \frac{n(n-1)}{1\cdot 2}x^{n-2}(\Delta x)^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^3 + \cdots$

     $y = x^n$

was our original equation, to be subtracted,

(2)  $\Delta y = nx^{n-1}\,\Delta x + \frac{n(n-1)}{1\cdot 2}x^{n-2}(\Delta x)^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^3 + \cdots$

(3)  $\frac{\Delta y}{\Delta x} = nx^{n-1} + \frac{n(n-1)}{1\cdot 2}x^{n-2}\,\Delta x + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^2 + \cdots$

Letting $\Delta x$ approach zero as limit and observing what was
said above about the vanishing of all powers of $\Delta x$ that stand
without a $\Delta$ denominator, we have remaining

(4)  $\frac{dy}{dx} = nx^{n-1}$     (Derivative of the function $y = x^n$)  (1)

¹ It may be shown that the rule for differentiating functions of this form
will hold for any real value of n.

From this general case it is now obvious that what we inferred 
as a possibility in our two previous examples is in fact the rule: 
the derivative in respect to x at any power has as its coefficient the 
original power of x and as its exponent the original exponent 
decreased by 1. This must be so because only the second term in 
the binomial expansion is free from the $\Delta x$ after the subtraction
of step (2) and the division of step (3) and because the coefficient 
of the second term in a binomial expansion is always the power 
to which the binomial is being raised and its exponent is always 
the original exponent less 1. 
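
The rule can be checked against the four-step process itself. The short Python sketch below is an illustration under our own assumptions (the names `difference_quotient` and `power_rule`, the trial point x = 1.5, and the small increment are not from the text): it carries steps (1) to (3) out numerically with a very small increment and compares the result with $nx^{n-1}$.

```python
def difference_quotient(f, x, dx):
    """Steps (1)-(3): (f(x + dx) - f(x)) / dx for a small but finite dx."""
    return (f(x + dx) - f(x)) / dx


def power_rule(n, x):
    """Step (4) in closed form: the derivative of x**n is n * x**(n - 1)."""
    return n * x ** (n - 1)


x = 1.5          # arbitrary trial point (an assumption for the demonstration)
dx = 1e-6        # small increment standing in for "approaching zero"

for n in (2, 3, 5):
    quotient = difference_quotient(lambda t, n=n: t ** n, x, dx)
    print(f"n = {n}: quotient = {quotient:.5f}, rule gives {power_rule(n, x):.5f}")
```

The two columns agree to several decimal places; the small remaining gap corresponds to the terms still carrying a power of the increment.
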
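Suppose now we try the effect of a constant as coefficient of
our variable x,

     $y = ax^n$

Here a may represent any coefficient we please, whether integral
or fractional, whether positive or negative. Going through our
four steps,

(1)  $y + \Delta y = a(x + \Delta x)^n$

     $y + \Delta y = ax^n + anx^{n-1}\,\Delta x + \frac{an(n-1)}{1\cdot 2}x^{n-2}(\Delta x)^2 + \frac{an(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^3 + \cdots$

(2)  $\Delta y = anx^{n-1}\,\Delta x + \frac{an(n-1)}{1\cdot 2}x^{n-2}(\Delta x)^2 + \frac{an(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^3 + \cdots$

(3)  $\frac{\Delta y}{\Delta x} = anx^{n-1} + \frac{an(n-1)}{1\cdot 2}x^{n-2}\,\Delta x + \frac{an(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}(\Delta x)^2 + \cdots$

(4)  $\frac{dy}{dx} = anx^{n-1}$     (Derivative of a constant times a function of the form $x^n$)  (2)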


Here the constant reappears in the derivative unchanged.
Hence, since a may represent any constant, we may say: The
derivative of a constant times a function is the same constant times 
the derivative of the function. 

Let us now take a more complicated expression, one involving 
x in each of several terms with different powers in each term, the 
total function being the sum of these several functions. 


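     $y = ax^3 + bx^2 + cx$

(1)  $y + \Delta y = ax^3 + 3ax^2\,\Delta x + 3ax(\Delta x)^2 + a(\Delta x)^3 + bx^2 + 2bx\,\Delta x + b(\Delta x)^2 + cx + c\,\Delta x$

     $y = ax^3 + bx^2 + cx$

(2)  $\Delta y = 3ax^2\,\Delta x + 3ax(\Delta x)^2 + a(\Delta x)^3 + 2bx\,\Delta x + b(\Delta x)^2 + c\,\Delta x$

(3)  $\frac{\Delta y}{\Delta x} = 3ax^2 + 3ax\,\Delta x + a(\Delta x)^2 + 2bx + b\,\Delta x + c$

(4)  $\frac{dy}{dx} = 3ax^2 + 2bx + c$     (3)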


If the reader will compare this derivative with the expression 
we started out to differentiate, he will observe that the derivative 
of the complex quantity made up of the sum of three terms is 
precisely the sum of the derivatives of the several terms if dif- 
ferentiated separately. If he will carry through on paper the 
generalized case or visualize to himself how it would work out, 
he will easily convince himself that that same conclusion would 
hold universally for any values for which the binomial law holds. 
Therefore, the derivative of the sum of any number of functions is 
the sum of their derivatives. 

Let us now try differentiating a constant, y = a. Since
$x^0 = 1$, the above equation might be written

     $y = x^0a$

Going through our four steps,

(1)  $y + \Delta y = (x + \Delta x)^0a = a$

Subtracting our original equation, y = a, we get

(2)  $\Delta y = 0$.  Then (3)  $\frac{\Delta y}{\Delta x} = 0$;  and (4)  $\frac{dy}{dx} = 0$

Thus the derivative of a constant is found to be zero. If the
reader will think of the a as placed as an addend in the series 
above when we were generalizing about the derivative of a sum 
of functions, he will perceive that it would behave in the same 
manner there as when standing alone; i.e., its derivative would be
zero, and it would not appear in the sum which constitutes the 
derivative of the complex function. So in the process of dif- 
ferentiation, any constant independent of the variable with 
respect to which we are differentiating disappears entirely from 
the derivative, since its own derivative is zero. 

We have now covered the simplest cases of differentiation.
We have yet to consider the complicated situations in which the 
term with respect to which we are differentiating occurs as an 
exponent, as a product, as the denominator of a fraction, etc. 
But before we proceed further let us draw together our findings 
so far, when we differentiate a function with respect to x, and 
make some applications of them. 

1. The derivative of a simple monomial containing any power 
of x is another monomial containing as its coefficient the original
exponent of x times the original coefficient and as its exponent 
the original exponent diminished by 1. 

2. The derivative of a constant times a function is equal 
to the constant times the derivative of the function. 

3. The derivative of a sum of monomials is the sum of the 
derivatives of these monomials. 

4. Terms independent of x disappear from the derivative when 
we are differentiating with respect to x, since the derivatives of 
any such terms are zero. This is on the assumption, of course, 
that these terms do not contain the y or any other function of x.
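
The four rules just listed are mechanical enough to be written as a few lines of code. The Python sketch below is one possible illustration, not a method given in the text: it assumes a sum of monomials is stored as a list of (coefficient, exponent) pairs, a representation chosen here purely for convenience, and the function name `differentiate` is likewise our own.

```python
def differentiate(terms):
    """Differentiate a sum of monomials a*x**n with respect to x.

    `terms` is a list of (coefficient, exponent) pairs (an assumed
    representation).  Returns the derivative in the same form.
    """
    derivative = []
    for coefficient, exponent in terms:
        if exponent == 0:
            continue  # rule 4: a term independent of x disappears
        # rules 1 and 2: new coefficient = exponent * coefficient,
        # new exponent = exponent - 1
        derivative.append((coefficient * exponent, exponent - 1))
    return derivative  # rule 3: the sum of the separate derivatives


examples = {
    "4x^3": [(4, 3)],
    "x^5 + 10": [(1, 5), (10, 0)],
    "6x^-2": [(6, -2)],
    "x^2 - 3x - 5": [(1, 2), (-3, 1), (-5, 0)],
}
for name, terms in examples.items():
    print(name, "->", differentiate(terms))
```

Read back as monomials, the results are $12x^2$, $5x^4$, $-12x^{-3}$, and $2x - 3$ respectively.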

On the left below we shall place certain equations to be differ- 
entiated with respect to x; on the right we shall indicate the 
derivatives of some of them while leaving others blank for the 
reader to complete as an exercise. 


Function                    Derivative

$y = 4x^3$                  $12x^2$

$y = x^5 + 10$              $5x^4$

$y = 6x^{-2}$               $-12x^{-3}$

$y = x^2 - 3x - 5$          $2x - 3$
v=% oe $7 e245 


y = Tat — dat oct — 22-4 
MINIMA AND MAXIMA

In order to get afresh in mind the meaning of a derivative, let 
us graph the equation $y = x^2 - 3x - 5$. Its derivative, as
given above, is (2x - 3). This indicates the rate at which y
is changing when x has any given value. When put graphically,
it means that at any value of x the slope of the line representing
the relation of y to x is (2x - 3). Let us list below some values
of x and some corresponding values of y derived from the original
equation. In Fig. 4 we shall locate these calculated points and
shall draw through them a smooth curve. As said above, the
fact that the derivative is (2x - 3) indicates that the slope of the
curve at any value of x is (2x - 3) and that at any such value
of x the y is changing (2x - 3) times as rapidly as the x.
Examine the curve at a few points in order to confirm this fact.

Fig. 4. Graph of the equation $y = x^2 - 3x - 5$.

When x = -2, the derivative is (2 · -2) - 3,
which equals -7. This statement indicates that the y value
is decreasing seven times as fast as the x is increasing—that, if 
one followed the graph with a pencil, the pencil would be moving 
downward seven times as rapidly as it is moving across the page
toward the right. Does inspection of the curve indicate to
you that such is true? Let x = 1½. Then the derivative is
(2 · 1½ - 3), which equals zero. This means that here y (the
vertical distance) does not change at all as you move along the x
(horizontal) direction, provided, of course, the distance through
which you move is an extremely short one. Does the figure
bear that out? Let x be 3. Then the derivative is

     (2 · 3 - 3) = +3.

The y should be increasing three times as rapidly as the x is
increasing. Does that look plausible? The reader may be 
interested in making additional suppositions about x values and 
in seeing how the derivative indicates the slope of the curve at 
those points and consequently the relative rapidity of changes in 
y as compared with changes in x at the points in question. No 
matter what the connection, differentiation always has precisely 
the sort of meaning and significance involved in this illustration. 
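
For readers who would rather tabulate than inspect a graph, the same three checks can be made in a few lines of Python. This is only a sketch; the function names `curve` and `slope` are our own labels for the equation and its derivative.

```python
def curve(x):
    """The graphed equation, y = x**2 - 3x - 5."""
    return x ** 2 - 3 * x - 5


def slope(x):
    """Its derivative, 2x - 3, i.e. the slope of the curve at x."""
    return 2 * x - 3


for x in (-2, 1.5, 3):
    print(f"x = {x:>4}: y = {curve(x):>6}, slope = {slope(x):>4}")
# x = -2  gives slope -7   (y falling seven times as fast as x rises)
# x = 1.5 gives slope 0    (curve momentarily level)
# x = 3   gives slope 3    (y rising three times as fast as x)
```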

Figure 4 lends itself so well to a comment about minima that 
we cannot refrain from entering that topic here. That will 
carry us, even at this early stage of our study, into the very 
heart of one of the most important applications of the differential 
calculus. We saw that when x = 1½, the slope of our curve was
zero. To this point the slope has been negative; i.e., to this
point it has been descending for increasing values of x. From
this point on the slope is positive; i.e., the curve ascends for
increasing values of x. Consequently the lowest value that y
can take lies at this point where x equals 1½. In other words,
y is then at a minimum. To find the point in x values where y
is at a minimum is one of the most important applications of the
calculus. We chanced to take 1½ as a value for x, and the slope
turned out to be zero. But we could easily have gotten this by
calculation. We could have set (2x - 3) equal to zero and
solved the equation to find x under this special assumption that
(2x - 3) is to equal zero. We would have the following simple
operation:

     2x - 3 = 0; 2x = 3; therefore x = 1½


Always, when we wish to find a minimum, we differentiate our 
function with respect to the variable on the scale of which 
we desire to know the position of the minimum in the related
variable, set the derivative equal to zero, and solve for the
unknown term. Since this matter of minima is so important in
statistics for the sake of which we are at present studying calculus,
as well as in most other areas in which calculus is employed,
perhaps it will pay us to stay longer with the topic and illustrate 
it more fully. So let us take another example—one right out of 
statistics. 

In the following equation y represents the errors squared in 
fitting a straight line to paired numbers. If we can find a value 
for r that will make this y a minimum, we shall have one formula 
for a coefficient of correlation. 


     $y = 1 - \frac{2r\sum z_1z_2}{N} + r^2$


The variables here are the y and the r; all other terms are con- 
stants. We wish to find that value for r that will make y a 
minimum. We must, therefore, differentiate the expression 
with respect to r and set the derivative equal to zero. Remember
that, if a constant is independent of the variable with respect
to which we are differentiating, it will disappear from the
derivative because its own derivative is zero. But remember that, if it
occurs as a coefficient of our variable, it will reappear as a coeffi- 
cient in the derivative. Differentiating, then, according to our 
rules, 


     $\frac{dy}{dr} = -\frac{2\sum z_1z_2}{N} + 2r$

Now set this derivative equal to zero and solve for r.

     $-2r = -\frac{2\sum z_1z_2}{N}$; whence $r = \frac{\sum z_1z_2}{N}$


This is the formula for the coefficient of correlation when our 
data are in the form of “standard measures.” But we shall learn 
more of this later; just now our attention is focused on the method 
of finding that value of the one variable at which the other is a 
minimum.
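
The result can be tried on a small set of numbers. The Python sketch below rests on several assumptions of our own: the five paired scores are invented for the demonstration, "standard measures" are taken to mean deviations from the mean divided by the standard deviation computed with N in the denominator, and the squared error is taken as the mean of $(z_2 - rz_1)^2$. Under those assumptions the value $r = \sum z_1z_2/N$ should give a smaller error than nearby values of r.

```python
from statistics import fmean, pstdev


def standard_measures(values):
    """Deviations from the mean in units of the (population) standard deviation."""
    mean = fmean(values)
    sd = pstdev(values)
    return [(v - mean) / sd for v in values]


def mean_squared_error(r, z1, z2):
    """Mean squared error of predicting z2 by the straight line r * z1."""
    return fmean((b - r * a) ** 2 for a, b in zip(z1, z2))


# Hypothetical paired scores, invented purely for illustration.
first = [2.0, 4.0, 5.0, 7.0, 9.0]
second = [1.0, 3.0, 2.0, 6.0, 8.0]

z1 = standard_measures(first)
z2 = standard_measures(second)

r = fmean(a * b for a, b in zip(z1, z2))  # r = sum(z1 * z2) / N

for trial in (r - 0.05, r, r + 0.05):
    print(f"r = {trial:.4f}: error = {mean_squared_error(trial, z1, z2):.6f}")
```

The middle line of output, at the computed r, shows the smallest error of the three, and that error equals $1 - r^2$ up to rounding.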

Try next 


     $y = 8x - x^2$

Differentiating, we get

     $\frac{dy}{dx} = 8 - 2x$

Setting the derivative equal to zero and solving for x, we have

     8 - 2x = 0; -2x = -8; therefore x = 4


Our minimum should be where x = 4. Let us see how that looks
on a graph. We shall determine from the original equation some
values for y from given values for x and then construct the curve
passing through these points.

Fig. 5. Graph of the equation $y = 8x - x^2$.

Wait a minute! The curve is flat at x = 4, that is true, but y 
is not at its lowest point. It is at its highest point instead. So 
far from being a minimum at x = 4, y is then at a maximum.
Now that we think of it, we see that we may have a slope of zero,
and consequently a horizontal direction of our graph, at the top 
where the curve has stopped mounting and has begun to descend 
as surely as at the bottom where it has stopped falling and has 
begun to ascend. When, therefore, dy/dx = 0, y is either a 
minimum or a maximum. How can we tell which? 

In the sort of curves with which we have been dealing, the 
slope of the line itself changes for different places along the 
x axis. At some points it becomes steeper and at others less
steep; at some points it is rising and at some it is descending.
Evidently the change of slope is itself a function of x; i.e., the
rapidity of change in the slope is predetermined by the place
along the x axis with which we are concerned. We might,
therefore, differentiate the expression for the slope itself and get a
value for the rapidity and direction with which the slope itself is
changing for various values of x. If at the point in which we
are interested (because, perhaps, at that point the y is either a
maximum or a minimum), the second derivative has the negative
sign, that means that, if we proceeded toward the higher values
of x, our slope would bend in the minus direction, downward.
We would, therefore, have been at the top of our curve, and our y
would have been at a maximum. If, however, the sign of the
second derivative is positive, that means that, as we move
from that point up the x scale, our curve must bend upward
(which is the plus direction), and we learn thereby that what we
had was a minimum value of y. Let us try this on the last
example differentiated above. Our derivative was, you
remember, (8 - 2x), and at our critical point this was equal to zero,
so that we had

     $\frac{dy}{dx} = 8 - 2x = 0$

We may designate the second derivative by $d^2y/dx^2$. Taking
this second derivative according to the same rules we use for
a first derivative,

     $\frac{d^2y}{dx^2} = \frac{d(8 - 2x)}{dx} = -2$

Sure enough, the second derivative is negative. It is this 
fact that it is negative rather than that it is numerically equal 
to 2 that interests us at present, for our only concern now is to 
know whether the curve would bend upward or downward if 
we proceeded out from this point. Our finding from the second 
differentiation is consistent with our graph; y is a maximum 
when x = 4. 

Let us now go back and see whether we were correct in sup- 
posing that our previous two examples involved minima rather 
than maxima. In our first example (the graphed one) 

     $y = x^2 - 3x - 5$;  $\frac{dy}{dx} = 2x - 3$;  $\frac{d^2y}{dx^2} = 2$

This second derivative has the plus sign, and we were, therefore,
correct in calling the y a minimum at the point where the slope
was zero.

In our second illustration, involving the correlation formula, 

     $y = 1 - \frac{2r\sum z_1z_2}{N} + r^2$;  $\frac{dy}{dr} = -\frac{2\sum z_1z_2}{N} + 2r$;  $\frac{d^2y}{dr^2} = 2$

Here, again, the second derivative is positive in sign, and,
consequently, we have a minimum for y.
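
For curves of the second degree the whole test reduces to two small calculations, which the following Python sketch sets out. It is an illustration under our own assumptions: the function name `classify_quadratic` and its return format are invented here, and it handles only functions of the form $y = ax^2 + bx + c$ with a not equal to zero.

```python
def classify_quadratic(a, b, c):
    """Locate and classify the level point of y = a*x**2 + b*x + c.

    The first derivative is 2*a*x + b, which is zero at x = -b / (2*a).
    The second derivative is the constant 2*a; its sign decides the matter.
    (c does not affect the location, since its derivative is zero.)
    """
    if a == 0:
        raise ValueError("not a second-degree curve: a must be nonzero")
    x_level = -b / (2 * a)
    kind = "minimum" if 2 * a > 0 else "maximum"
    return x_level, kind


print(classify_quadratic(1, -3, -5))   # y = x^2 - 3x - 5
print(classify_quadratic(-1, 8, 0))    # y = 8x - x^2
```

The first call reports a minimum at x = 1.5 and the second a maximum at x = 4.0, in agreement with the two graphed examples.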


Let us take yet one more exercise in this interesting topic of
maxima and minima. A gardener wishes to enclose a tract of
land with a high deer fence, and he wishes to know in what
proportions he should lay out a rectangular plot so as to get
the maximum amount of space enclosed with a given amount
of fence. If we let k equal half of the given perimeter of his
proposed tract, and let x be the length of one side so that the
adjacent side is k - x, his diagram will look like Fig. 6. Let y
be the area. Since the area of a rectangle is the product of its
length by its width,

     $y = x(k - x) = kx - x^2$

Fig. 6.


Differentiating this so as to ascertain for what length of x the
area y will be a maximum and solving for x,

     $\frac{dy}{dx} = k - 2x = 0$;  $-2x = -k$;  $x = \frac{1}{2}k$

In order to make sure we have a maximum and not a minimum,
we shall take a second derivative.

     $\frac{d^2y}{dx^2} = \frac{d(k - 2x)}{dx} = -2$

The second derivative is negative, and, therefore, what we 
have is a maximum value for the area. So the field will be laid 
out most economically if the length is half the sum of the length 
and the width; t.e., if the length equals the width and hence the 
tract is laid out square. 
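
The gardener's result can also be checked by brute force. The following is only a sketch under stated assumptions: the half-perimeter of 20 units and the function names are hypothetical choices made for illustration, and the program simply tries many lengths and keeps the one that encloses the most ground.

```python
def area(x, k):
    """Area of a rectangle with sides x and k - x (k is half the perimeter)."""
    return x * (k - x)


def best_length(k, steps=1000):
    """Try evenly spaced lengths from 0 to k and return the one with the largest area."""
    candidates = [k * i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda x: area(x, k))


k = 20.0  # hypothetical half-perimeter, i.e. 40 units of fence
x_best = best_length(k)
print(x_best, k - x_best, area(x_best, k))
```

The length found by the search equals the width, in agreement with x = k/2 obtained from the derivative.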

We shall later learn that there are circumstances under which 
we need to take a third derivative. It is, in fact, possible to 
differentiate as many times in succession as we please and as our 
purpose requires. 

After this excursion into maxima and minima, from which we 
hope the reader will have derived a more complete comprehension 
of the meaning and possible applications of the process of 
differentiation, we shall take up again the technique of differ- 
entiating different algebraic forms. So far we have had only 
the simplest ones. We have to learn how to differentiate a 
product, a fraction, a power, a logarithm, etc.

THE DERIVATIVE OF A PRODUCT

We have already learned how to handle the differentiation
of a function of x that involves a polynomial. From now on we
shall let a single letter (say u or v) represent the function of x 
no matter how complex that function may be; for we know that, 
if we are called upon to differentiate u or v in any connection, 
we shall be able to differentiate the complex function of x that 
these simple symbols represent. Using u and v, then, to stand 
for functions of x, we shall inquire how to differentiate a product. 


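$$(1)\qquad y = uv$$
$$y + \Delta y = (u + \Delta u)(v + \Delta v)$$

Expanding,

$$y + \Delta y = uv + u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v$$

Subtracting our original equation,

$$(2)\qquad \Delta y = u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v$$

Dividing by Δx,

$$(3)\qquad \frac{\Delta y}{\Delta x} = \frac{u\,\Delta v}{\Delta x} + \frac{v\,\Delta u}{\Delta x} + \frac{\Delta u\,\Delta v}{\Delta x}$$

$$(4)\qquad \frac{dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx} \qquad \text{(Derivative of a product)} \quad (4)$$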


The reader must have in mind in the transition from step (3)
to step (4), and must continue to hold in mind in this transition
in all the following developments, that we make the transition
by letting Δx approach zero as its limit whereby the Δ's in the
numerator will be dragged down with the Δx; and, as the Δx
approaches its limit, we replace Δy/Δx by dy/dx, or whatever
other symbols happen to represent our functions in the particular
problem. He must keep in mind, too, that, when a Δ is approaching
zero as a limit and is not divided by another Δ which is also
approaching zero as a limit, it drops out of the equation at the
limit because its value is zero and it carries out with it all other
factors by which it is multiplied. That is what happens to the
last term of step (3): one of its two Δ's is paired off with the
Δx, and the other, left undivided, carries the whole term to zero.

Thus the derivative of a product turns out to be the first 
factor times the derivative of the second plus the second factor 
times the derivative of the first. The following is a more
concrete example:

$$y = (ax^4)(bx^2); \qquad \frac{dy}{dx} = \frac{d(ax^4)}{dx}\,(bx^2) + \frac{d(bx^2)}{dx}\,(ax^4)$$

$$= (4ax^3 \cdot bx^2) + (2bx \cdot ax^4) = 4abx^5 + 2abx^5 = 6abx^5$$


That is exactly what we would have obtained if, at the beginning, 
we had multiplied our two factors together and had differentiated 
the product, as the reader may wish to verify. In this case that 
would have been just as simple. But, of course, not all cases 
would permit such ready combination of the original factors. 
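
The reader who likes to see a rule tested on numbers may try a sketch such as the following. It is an illustration only: the factors ax⁴ and bx², the constants a = 2 and b = 3, and the helper `slope` (which estimates a derivative from a very small Δx) are all assumptions made for the example.

```python
def slope(f, x, h=1e-6):
    """Estimate the derivative of f at x from a small increment h on either side."""
    return (f(x + h) - f(x - h)) / (2 * h)


A, B = 2.0, 3.0  # hypothetical values for the constants a and b


def u(x):
    return A * x**4


def v(x):
    return B * x**2


def product_rule(x):
    """First factor times slope of the second, plus second times slope of the first."""
    return u(x) * slope(v, x) + v(x) * slope(u, x)


def multiplied_out(x):
    """Derivative of the combined product ab*x^6, namely 6ab*x^5."""
    return 6 * A * B * x**5


print(product_rule(1.5), multiplied_out(1.5))
```

The two printed values agree to within the small error of the numerical estimate.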


THE DERIVATIVE OF A QUOTIENT (FRACTION) 
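Taking again u and v as functions of x, let y = u/v.

$$(1)\qquad y + \Delta y = \frac{u + \Delta u}{v + \Delta v}$$

Subtracting from this the original equation,

$$(2)\qquad \Delta y = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v}$$

Raising both fractions to a common denominator so that we may
subtract,

$$\Delta y = \frac{v(u + \Delta u)}{v(v + \Delta v)} - \frac{u(v + \Delta v)}{v(v + \Delta v)} = \frac{uv + v\,\Delta u - uv - u\,\Delta v}{v^2 + v\,\Delta v}$$

$$\Delta y = \frac{v\,\Delta u - u\,\Delta v}{v^2 + v\,\Delta v}$$

Dividing by Δx,

$$(3)\qquad \frac{\Delta y}{\Delta x} = \frac{v(\Delta u/\Delta x) - u(\Delta v/\Delta x)}{v^2 + v\,\Delta v}$$

Letting Δx approach zero as a limit, whereupon the v Δv in the
denominator drops out,

$$\frac{dy}{dx} = \frac{v(du/dx) - u(dv/dx)}{v^2} \qquad \text{(The derivative of a quotient)} \quad (5)$$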
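
A numerical trial of the quotient formula may be sketched as follows. The two functions u = x³ + 1 and v = x² + 2 are hypothetical choices (v is never zero, so the fraction is always defined), and `slope` is an assumed helper that estimates a derivative from a small increment.

```python
def slope(f, x, h=1e-6):
    """Estimate the derivative of f at x from a small increment h on either side."""
    return (f(x + h) - f(x - h)) / (2 * h)


def u(x):
    return x**3 + 1.0  # hypothetical numerator


def v(x):
    return x**2 + 2.0  # hypothetical denominator, never zero


def quotient_rule(x):
    """(v * du/dx - u * dv/dx) / v^2, with du/dx = 3x^2 and dv/dx = 2x."""
    du_dx = 3 * x**2
    dv_dx = 2 * x
    return (v(x) * du_dx - u(x) * dv_dx) / v(x) ** 2


def estimated(x):
    """Slope of the fraction u/v estimated directly."""
    return slope(lambda t: u(t) / v(t), x)


print(quotient_rule(1.2), estimated(1.2))
```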


THE DERIVATIVE OF A FUNCTION OF A FUNCTION 
It often happens that an expression is complicated in a fashion
that makes it difficult to differentiate in the straightforward
manner we have so far learned. We may then find it feasible
to simplify our procedure by dividing the process into two or
more steps. Let us write the function,

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}$$

where y is a function of u and u is a function of x. Now let
Δx approach zero as a limit. As Δx approaches zero as its
limit, it will necessarily carry with it the Δ's of its functions.
Hence

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

That is, we may break up our expression into two factors,
differentiate the first with respect to u and the u with respect
to x, and take as our derivative the product of these two. Take,
for example, the expression y = √((x³ - 2a)³), which equals
(x³ - 2a)^(3/2). We may let (x³ - 2a) equal u. Then y = u^(3/2);
dy/du = (3/2)u^(1/2). Now differentiating the expression for which
u stands, viz., (x³ - 2a), with respect to x, we have du/dx = 3x².
Taking the product of these two derivatives, but substituting
the value of the u for the u, we have

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = \tfrac{3}{2}(x^3 - 2a)^{1/2} \cdot 3x^2 = \tfrac{9}{2}x^2\sqrt{x^3 - 2a}$$

Recourse to this dodge often makes comparatively easy
differentiations that would otherwise be extremely difficult,
and workers with calculus exercise great ingenuity in discovering
ways in which to break up expressions into component factors
that are more readily differentiated than the original one.
Sometimes an expression is broken up into three or more factors,
for evidently

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dz} \cdot \frac{dz}{dx} \qquad \text{(Derivative of a function of a function)} \quad (6)$$

THE DERIVATIVE OF AN INVERSE FUNCTION 
Another roundabout method that sometimes simplifies the 
process of differentiation is to shift temporarily from the necessity 
of differentiating with respect to x and to do the differentiating 
instead with respect to y, then to reach the differentiation with 
respect to x by a second step. By the same process as that used 
in the preceding section, it may be proved that

$$\frac{dy}{dx} = \frac{1}{dx/dy}. \qquad \text{Hence} \qquad \frac{dx}{dy} = \frac{1}{dy/dx}$$

We may, therefore, differentiate with respect to y (as is indicated
by the fact that the dy occurs in the denominator), then use
the reciprocal of this derivative as the derivative of y with respect
to x. Suppose, for example, we have the equation: y² = 3x + 4.
We might transpose and solve for x as follows:

$$x = \frac{y^2 - 4}{3} = \frac{y^2}{3} - \frac{4}{3}$$

Differentiating now with respect to y,

$$\frac{dx}{dy} = \frac{2y}{3}$$

Since, as shown above,

$$\frac{dy}{dx} = \frac{1}{dx/dy}, \qquad \frac{dy}{dx} = \frac{1}{2y/3} = \frac{3}{2y}$$

Substituting in this last equation the value of y from the original
equation, we have as our derivative:

$$\frac{dy}{dx} = \frac{3}{2\sqrt{3x + 4}}$$

INTRODUCING A NOTABLE CHARACTER—e 
There is a remarkable quantity in mathematics to which we
must give attention before we can proceed further. It is
designated by the letter e and has as its value (1 + 1/n)^n as the n
approaches infinity. Let us expand this value according to the
binomial theorem and through this expansion determine the
numerical value of e. The reader must remember that 1 raised
to any power is still 1.

$$e = \left(1 + \frac{1}{n}\right)^n = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{1 \cdot 2} \cdot \frac{1}{n^2} + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} \cdot \frac{1}{n^3} + \frac{n(n-1)(n-2)(n-3)}{1 \cdot 2 \cdot 3 \cdot 4} \cdot \frac{1}{n^4} + \cdots$$

But as n approaches infinity the (n - 1), (n - 2), etc., will
not differ appreciably from n, so that the factors containing
n's will cancel out of each numerator and corresponding
denominator. This will be particularly true near the beginning of the
series, where the fractions have an appreciable size; and to the
extent to which it is not true it will force more rapid convergence
of the series, since the n factors in the numerator are smaller
than the corresponding ones in the denominator. We shall
then have

$$e = 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \frac{1}{2 \cdot 3 \cdot 4} + \frac{1}{2 \cdot 3 \cdot 4 \cdot 5} + \cdots$$

This series rapidly converges and, while incommensurable with 1,
has as its value, rounded to six decimal places, 2.718282. To two
decimal places e may be taken as 2.72.
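
How rapidly the series converges, and how slowly the expression (1 + 1/n)^n itself creeps toward the same value, can be seen from a short computation. This is a sketch only; the function names and the particular numbers of terms are hypothetical choices.

```python
import math


def e_from_series(terms):
    """Sum 1 + 1 + 1/2 + 1/(2*3) + ... using the given number of terms."""
    total = 0.0
    term = 1.0  # the first term of the series
    for k in range(terms):
        total += term
        term /= k + 1  # the next term divides by the next whole number
    return total


def e_from_limit(n):
    """Evaluate (1 + 1/n)**n for a finite n."""
    return (1 + 1 / n) ** n


print(e_from_series(15))
print(e_from_limit(1_000_000))
print(math.e)
```

Fifteen terms of the series already agree with the library value of e to many more places than a million-fold n gives in the original expression.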

If we develop, and then differentiate, the value of e^x, we shall
begin to see wherein lies the remarkable property of e.

$$e^x = \left(1 + \frac{1}{n}\right)^{nx} = 1 + nx \cdot \frac{1}{n} + \frac{nx(nx-1)}{1 \cdot 2} \cdot \frac{1}{n^2} + \frac{nx(nx-1)(nx-2)}{1 \cdot 2 \cdot 3} \cdot \frac{1}{n^3} + \cdots$$

Since the n's cancel out for the same reason as given above, this
becomes, as n approaches infinity,

$$e^x = 1 + x + \frac{x^2}{1 \cdot 2} + \frac{x^3}{1 \cdot 2 \cdot 3} + \frac{x^4}{1 \cdot 2 \cdot 3 \cdot 4} + \cdots$$

Let us now differentiate this with respect to x, indicating the
differentiation on the left and performing it on the right.

$$\frac{d(e^x)}{dx} = 0 + 1 + \frac{2x}{1 \cdot 2} + \frac{3x^2}{1 \cdot 2 \cdot 3} + \frac{4x^3}{1 \cdot 2 \cdot 3 \cdot 4} + \cdots = 1 + x + \frac{x^2}{1 \cdot 2} + \frac{x^3}{1 \cdot 2 \cdot 3} + \cdots$$

But that is just what we had before. If we should continue
to take successive derivatives we would always get the same
thing we had to start with. e^x has, therefore, the remarkable
property of giving a derivative exactly equal to the function itself.
This is a property of immense importance in higher mathematics.
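
The term-by-term differentiation can be imitated on a machine. In the sketch below (function names and the choice of 30 terms are assumptions) one function sums the series for e^x and the other sums the derivative of each term in turn; the two sums come out the same.

```python
import math


def exp_series(x, terms=30):
    """Sum the series 1 + x + x^2/(1*2) + x^3/(1*2*3) + ..."""
    return sum(x**k / math.factorial(k) for k in range(terms))


def exp_series_derivative(x, terms=30):
    """Differentiate each term x^k / k! to k * x^(k-1) / k! and sum the results."""
    return sum(k * x ** (k - 1) / math.factorial(k) for k in range(1, terms))


x = 1.3  # hypothetical value at which to compare
print(exp_series(x), exp_series_derivative(x), math.exp(x))
```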


THE DERIVATIVE OF A LOGARITHM 
We are now in position to develop a formula for the derivative
of a logarithm. Since logarithms are treated in practically all
texts in algebra, except some of those intended for a single
first-year course, we shall assume here that the reader is already
familiar with them or that he will take occasion at once to go
to a textbook in algebra to learn about them. A logarithm is a
power to which a certain number, called the base, must be raised
in order to give another number in which we are interested.
Thus the power to which 10 must be raised in order to give 100 is
2; hence 2 is the logarithm of 100 to the base 10. The power to
which 10 must be raised to give 247 is 2.3927; hence that is the
logarithm of 247 to the base 10. But in calculus we seldom use
the base 10; we use e instead, because of the remarkable properties
we said above that it possesses. However, logarithms to the
base e obey in every respect precisely the same laws as those with
which the reader is, presumably, already familiar with the base 10.

Now for the derivative of a logarithm. Where v is any function
of x, let y = log_e v. We shall carry this through the four
fundamental steps through which we carried earlier processes
of differentiation.

$$(1)\qquad y + \Delta y = \log_e (v + \Delta v)$$

Subtracting original equation,

$$(2)\qquad \Delta y = \log_e (v + \Delta v) - \log_e v$$

It is one of the principles of logarithms that the log of one
quantity minus the log of a second equals the log of the
first quantity divided by the second. Hence (2) becomes

$$\Delta y = \log_e \frac{v + \Delta v}{v}$$

Dividing by Δv,

$$\frac{\Delta y}{\Delta v} = \frac{1}{\Delta v} \log_e \frac{v + \Delta v}{v} = \frac{1}{\Delta v} \log_e \left(1 + \frac{\Delta v}{v}\right)$$

We may multiply and divide the right-hand member by v without
changing its value. Hence

$$\frac{\Delta y}{\Delta v} = \frac{1}{v} \cdot \frac{v}{\Delta v} \log_e \left(1 + \frac{\Delta v}{v}\right)$$

But anything of the form b · log u may, according to the laws of
logarithms, be written log u^b. Hence we may write

$$(3)\qquad \frac{\Delta y}{\Delta v} = \frac{1}{v} \log_e \left(1 + \frac{\Delta v}{v}\right)^{v/\Delta v}$$

The quantity in parentheses is of the form (1 + 1/n)^n. Now that
is a familiar expression. We met it in our preceding section.
Where the n is supposed to increase to infinity, it is precisely our
old friend e. If we let Δv approach zero as its limit, the exponent
v/Δv will approach infinity, as it should to make our expression
in parentheses equal e. Also, as Δv approaches zero as its limit,
Δy/Δv will become dy/dv, and we shall have our derivative

$$\frac{dy}{dv} = \frac{1}{v} \log_e e$$

But any log of its own base is 1. Hence log_e e = 1 and we have,
bringing in the x by the rule for a function of a function,

$$\frac{dy}{dv} = \frac{1}{v}; \qquad \frac{dy}{dx} = \frac{dy}{dv} \cdot \frac{dv}{dx} = \frac{1}{v} \cdot \frac{dv}{dx}$$

Remember that we started with the equation y = log_e v. Putting
this value for the y in our differential equation, we have:

$$\frac{d(\log_e v)}{dx} = \frac{dv/dx}{v} \qquad \text{(Derivative of a logarithm)} \quad (7)$$

The derivative of a logarithm to the base e of any function of
x is, therefore, the derivative of that function divided by the
function itself. Take the following concrete example:


y= + dz — 8. 
dan + de — 8), 1 met 
Hye- Fyi 
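
The rule that the derivative of a logarithm is the derivative of the function divided by the function can be tried numerically. The sketch below is an illustration under stated assumptions: the function v = x² + 4x + 8 is a hypothetical choice (it is positive for every x, so its logarithm always exists), and `slope` is an assumed helper that estimates a derivative from a small increment.

```python
import math


def v(x):
    return x**2 + 4 * x + 8  # hypothetical function, positive for every x


def dv_dx(x):
    return 2 * x + 4


def log_derivative(x):
    """Derivative of log v: the derivative of v divided by v itself."""
    return dv_dx(x) / v(x)


def slope(f, x, h=1e-6):
    """Estimate the derivative of f at x from a small increment h on either side."""
    return (f(x + h) - f(x - h)) / (2 * h)


print(log_derivative(1.0), slope(lambda t: math.log(v(t)), 1.0))
```
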
THE DERIVATIVE OF A POWER FORM 

In this case we have shifted our x function to a place where it
would seem to be very difficult to get at: to the exponent. Let
a be any constant and u any function of x to which a is raised as
a power, and let b be any coefficient of the a^u; i.e., let y = ba^u.
Taking logarithms to the base e of each side of this equation
(hereafter it is to be understood that our logarithms are always
taken to the base e without our writing the e as a subscript), we
have

$$\log y = \log (a^u) + \log b = u \log a + \log b$$

Transposing and solving for u,

$$u = \frac{\log y - \log b}{\log a} = \frac{1}{\log a} \log y - \frac{\log b}{\log a}$$

We shall now differentiate with respect to y. Since a is a
constant, 1/log a is a constant and will reappear as such in the
derivative. Likewise log b is a constant, and therefore log b/log a
is a constant. But, since this constant is independent of y, its
derivative is zero, so that it will disappear from the derivative
of the whole expression. Remember that, as shown in our last
section above, the derivative of the log of a quantity is the
derivative of the quantity divided by the quantity, and notice
that dy/dy = 1.

$$\frac{du}{dy} = \frac{1}{\log a} \cdot \frac{dy/dy}{y} - 0 = \frac{1}{y \log a}$$

Under our topic, Derivative of an Inverse Function, we showed
that

$$\frac{dy}{du} = \frac{1}{du/dy}$$

Therefore

$$\frac{dy}{du} = \frac{1}{1/(y \log a)} = y \log a$$

But our original equation was y = ba^u. Replacing the y above
with this value from the original equation, we have

$$\frac{dy}{du} = ba^u \log a$$

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = ba^u \cdot \log a \cdot \frac{du}{dx} \qquad \text{(Derivative of a power form)} \quad (8)$$

Therefore the derivative of a constant raised to a power which
is a function of x is the constant raised to the original power
times the log of the constant times the derivative of the x function
that constitutes the power. If additional constants appear as
coefficients
but not themselves raised to a power involving x, these coefficients
recur unchanged in the derivative.

$$y = 20^{(x^3 - 2x^2 + 8)}; \qquad \frac{dy}{dx} = 20^{(x^3 - 2x^2 + 8)} \cdot (2.9957)(3x^2 - 4x)$$

The 2.9957 is the log of 20 and the (3x² - 4x) is the derivative of
the original exponent. The derivative is a general expression
for the slope of the curve y = 20^(x³ - 2x² + 8) for any value of x in
which we may chance to be interested. Suppose we wish to
find the slope of the curve where x = 1⅓. By substituting in
the above differential expression, we shall find that dy/dx is
precisely zero where x equals 1⅓. That is, the line is precisely
parallel to the x axis where x = 1⅓.

We have an especially simple case in this formula where the
constant is e. This occurs so frequently that it will pay us to
derive a general rule involving it. We need only put e in place of
a in the generalized case treated above.

$$\frac{d(e^u)}{dx} = e^u \cdot \log e \cdot \frac{du}{dx}$$

But since we are working with logarithms to the base e, log e
equals 1, since any log of its own base is 1. Therefore

$$\frac{d(e^u)}{dx} = e^u \cdot \frac{du}{dx} \qquad \text{(Derivative of the power form } e^u\text{)} \quad (9)$$

If the function u should be simply x, the derivative would take
a still simpler form, as follows:

$$\frac{d(e^x)}{dx} = e^x \cdot \frac{dx}{dx} = e^x \cdot 1 = e^x$$

Thus we come back again to the queer and important fact that we
discovered when we first met this quantity e, a few pages back,
viz., that, when we differentiate e^x with respect to x, we get as our
derivative precisely the same thing we had before differentiating.
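
The power-form rule may be tried on a curve with base 20. The sketch below assumes, for illustration, the exponent x³ - 2x² + 8, whose derivative 3x² - 4x vanishes at x = 1⅓; the function names are hypothetical.

```python
import math


def exponent(x):
    return x**3 - 2 * x**2 + 8  # assumed exponent for the illustration


def slope(x):
    """Constant to the original power, times log of the constant, times the
    derivative of the exponent."""
    return 20 ** exponent(x) * math.log(20) * (3 * x**2 - 4 * x)


print(math.log(20))    # the natural logarithm of 20, about 2.9957
print(slope(1.0))      # negative: the curve is falling here
print(slope(4 / 3))    # essentially zero: the curve is level here
print(slope(2.0))      # positive: the curve is rising here
```

Because 4/3 cannot be held exactly in the machine, the middle value may print as a very small number rather than an exact zero.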


MORE PRACTICE IN APPLICATIONS 

After we had covered our first round of the simplest forms of
differentiation, we paused to get some practice and to apply our
techniques, so far, to finding maxima and minima. Now that we
are through a second major cycle, let us again pause to make some
applications. This time we shall take a fairly complicated
function with which to work, but one that plays an extremely
important part in social and educational statistics. The reader will
need to watch his step in order to follow the process. But no
new principles are involved beyond those treated in the preceding
few sections. Indeed it is characteristic of calculus that its
principles are simple but that its challenge consists in finding
ingenious methods of analyzing the functions in question so
as to put them into forms that are familiar; and in following out
algebraic processes that sometimes become a little complicated.
The function on which we shall practice here is the equation for
the curve of a normal distribution. We shall later learn that
this important statistical formula is

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}$$

where N is the number of cases in the distribution, π is 3.1416, σ
is a constant for a given distribution with which the reader is
either already familiar or soon will be, y measures the height of
the vertical ordinate at successive values of x, and x measures
along the horizontal axis in terms of deviations from the mean of
the whole distribution as origin. The meanings mentioned for
these symbols are, of course, the ones customarily attached to
them by mathematicians, and we are defining them here merely
for the benefit of the lay reader.

An inspection of our equation will show that it is of the form
y = be^u, for which form the derivative was given in our last
preceding section. The N/(σ√(2π)) is the constant that
corresponds to b and the -x²/2σ² is the u. In order to save
complications in our notation we shall carry along b for the complex value
for which it stands. We shall rewrite the equation and then
proceed with its differentiation.

$$y = b\,e^{-\frac{x^2}{2\sigma^2}}$$

$$\frac{dy}{dx} = b\,e^{-\frac{x^2}{2\sigma^2}} \cdot \frac{d}{dx}\left(-\frac{x^2}{2\sigma^2}\right) = b\,e^{-\frac{x^2}{2\sigma^2}} \left(-\frac{2x}{2\sigma^2}\right)$$

whence

$$\frac{dy}{dx} = -\frac{xb}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}} = -\frac{xb\, e^{-\frac{x^2}{2\sigma^2}}}{\sigma^2} \qquad \text{(First derivative of the normal curve function)} \quad (10)$$


In this expression farthest to the right the e with its negative
exponent could be transferred to the denominator by making its
exponent positive, on the general principle that a^(-n) = 1/a^n.
Figure 7 shows the normal curve. The reader should give himself
some practice in interpretation by testing the significance
of the above derivative with reference to it. Remember that
the x distances are measured from the mean (center) of the
distribution as origin, plus to the right and minus to the left.
According to the derivative, the slope should be plus (i.e., up)
on the left side of the curve where x is minus, for here we have
minus times minus values of x which should give plus. Does the
behavior of the actual curve conform to that deduction from the
derivative? On the right side of the curve the slope should be
negative (downward), for here we have minus times plus values
of x. Does inspection of the curve bear that out? If we
substitute x = 0, we should have dy/dx = 0, for obviously the
curve is horizontal (i.e., has a slope of zero) at the middle of
the distribution. Try substituting zero for x in the derivative,
and see whether dy/dx turns out to be zero. Conversely, we
should be able to make up our minds that we wish to find the
place where the slope should be zero and to find it by setting the
derivative equal to zero and solving for x. Let us try.

[Fig. 7. Graph of a normal distribution.]

$$\frac{dy}{dx} = -\frac{bx}{\sigma^2 e^{\frac{x^2}{2\sigma^2}}} = 0$$

Clearing of fractions, then dividing through by -b,

$$-bx = 0; \qquad \text{therefore } x = 0$$

Thus we deduce from the derivative that the curve should be
parallel to the x axis at the middle of the distribution, where
x = 0.
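
The questions just put to the reader can also be put to a machine. The sketch below assumes a hypothetical distribution of N = 1000 cases with σ = 2; it evaluates the first derivative on the left, at the middle, and on the right of the curve, and compares it with a slope estimated from a small increment.

```python
import math


def normal_curve(x, n=1000.0, sigma=2.0):
    """Height of the normal curve at a deviation x from the mean."""
    return n / (sigma * math.sqrt(2 * math.pi)) * math.exp(-x**2 / (2 * sigma**2))


def first_derivative(x, n=1000.0, sigma=2.0):
    """Formula (10): minus x over sigma squared, times the height of the curve."""
    return -x / sigma**2 * normal_curve(x, n, sigma)


def slope(f, x, h=1e-5):
    """Estimate the derivative of f at x from a small increment h on either side."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-1.0, 0.0, 1.0):
    print(x, first_derivative(x), slope(normal_curve, x))
```

The slope is positive to the left of the mean, zero at the mean, and negative to the right.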
We might have divided our derivative equation by -bx
instead of multiplying through by the σ²e^(x²/2σ²). We would then
have:

$$\frac{1}{\sigma^2 e^{\frac{x^2}{2\sigma^2}}} = 0$$

taking reciprocals,

$$\sigma^2 e^{\frac{x^2}{2\sigma^2}} = \frac{1}{0} = \infty$$

Dividing through by σ²,

$$e^{\frac{x^2}{2\sigma^2}} = \infty$$

Since log e = 1,

$$\frac{x^2}{2\sigma^2} = \infty; \qquad x^2 = 2\sigma^2 \cdot \infty = \infty; \qquad x = \sqrt{\infty} = \pm\infty$$

Thus the curve should become horizontal again at plus or
minus infinity, as well as where x equals zero. Does inspection
of the curve make that plausible?

If we take a second derivative, we shall have an expression
for the rapidity with which the slope of the curve itself is
changing with successive values of x. See whether you can verify
the following as the derivative. For convenience we shall repeat
the first derivative, then proceed to take from it a second
derivative. The reader must remember that we have the following two
principles involved: we have the product of two variables and
we have the form e^u. If necessary he should turn back to the
discussion of the differentiation of these two forms.

$$\frac{dy}{dx} = -\frac{bx}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}}$$

$$\frac{d^2y}{dx^2} = -\frac{bx}{\sigma^2} \cdot e^{-\frac{x^2}{2\sigma^2}} \left(-\frac{x}{\sigma^2}\right) + e^{-\frac{x^2}{2\sigma^2}} \left(-\frac{b}{\sigma^2}\right) = \frac{bx^2}{\sigma^4}\, e^{-\frac{x^2}{2\sigma^2}} - \frac{b}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}}$$

$$= \frac{b(x^2 - \sigma^2)}{\sigma^4}\, e^{-\frac{x^2}{2\sigma^2}} = \frac{N(x^2 - \sigma^2)}{\sigma^5\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad \text{(Second derivative of the normal function)} \quad (11)$$

By substituting in this expression different values of x, we
could find how rapidly the slope is changing at any value of x
we choose. If we set the second derivative equal to zero, we shall
find the value of x at which the change of slope is a minimum.
Let us try.

$$\frac{b(x^2 - \sigma^2)}{\sigma^4}\, e^{-\frac{x^2}{2\sigma^2}} = 0$$

Dividing through by be^(-x²/2σ²)/σ⁴,

$$x^2 - \sigma^2 = 0; \qquad x^2 = \sigma^2; \qquad x = \pm\sigma$$

So the point at which the curve is nearest a straight line is
exactly one σ in each direction from the mean. That is the point
where the curve stops bending inward and begins bending
outward. Remembering that the whole distance from the mean
to the place we have cut off the curve is about 2.5σ's, does our
finding look plausible? Try dividing through by the coefficient
of e, and see whether you can find another point at which the
change of slope is a minimum.

If the reader has the necessary hardihood, he might try, on
his own, to take a third derivative. He should find it to be

$$\frac{d^3y}{dx^3} = \frac{N(3\sigma^2 x - x^3)}{\sigma^7\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad \text{(Third derivative of the normal function)} \quad (12)$$

This is an expression for the rapidity with which the change
of slope is itself changing for various values of x. If the reader
will set this derivative equal to zero and solve for x, he can find
that point in the curve where it is bending most rapidly, where
the tail begins rapidly to thin. If he wishes to verify the fact
that at this point the speed of the bending is a maximum and
not a minimum, he may take a fourth derivative and assure
himself that at this point the value of the fourth derivative is negative
in sign.
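
The second and third derivatives may be examined in the same manner as the first. The sketch below again assumes a hypothetical distribution with N = 1000 and σ = 2, and uses the forms b(x² - σ²)/σ⁴ and b(3σ²x - x³)/σ⁶, each multiplied by the exponential factor, where b stands for N/(σ√(2π)).

```python
import math

N, SIGMA = 1000.0, 2.0  # hypothetical number of cases and sigma


def normal_curve(x):
    """Height of the normal curve at a deviation x from the mean."""
    return N / (SIGMA * math.sqrt(2 * math.pi)) * math.exp(-x**2 / (2 * SIGMA**2))


def second_derivative(x):
    """Rate at which the slope is changing at x."""
    return (x**2 - SIGMA**2) / SIGMA**4 * normal_curve(x)


def third_derivative(x):
    """Rate at which the change of slope is itself changing at x."""
    return (3 * SIGMA**2 * x - x**3) / SIGMA**6 * normal_curve(x)


print(second_derivative(SIGMA))                 # zero one sigma from the mean
print(second_derivative(1.0))                   # negative inside one sigma
print(second_derivative(3.0))                   # positive outside one sigma
print(third_derivative(math.sqrt(3) * SIGMA))   # essentially zero
```

The change of sign of the second derivative at one σ marks the place where the curve stops bending one way and begins bending the other.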
THE DERIVATIVE OF A SINE 

In the applications of calculus to statistics as presented in
this volume we make no use of differentiation of trigonometric
functions. Nevertheless, because this plays so large a part in
the full treatment of the calculus, we shall carry the reader
through one development, "just for fun." If he does not care
to follow for the sake of getting a glimpse into this part of the
calculus, he may skip this section. We shall find the derivative
of the sine of an angle of value x. We go through our customary
four steps.

$$y = \sin x; \qquad (1)\quad y + \Delta y = \sin (x + \Delta x)$$

Subtracting the original equation but merely indicating the
subtraction on the right,

$$(2)\qquad \Delta y = \sin (x + \Delta x) - \sin x$$

In trigonometry the formula is established that

$$\sin A - \sin B = 2 \cos \tfrac{1}{2}(A + B) \sin \tfrac{1}{2}(A - B)$$

The A may stand for our (x + Δx) and the B for our x.
Applying this theorem, we have

$$\Delta y = 2 \cos \tfrac{1}{2}(x + \Delta x + x) \sin \tfrac{1}{2}(x + \Delta x - x)$$

$$\Delta y = 2 \cos \left(x + \frac{\Delta x}{2}\right) \sin \frac{\Delta x}{2}$$

Dividing both members by Δx and rearranging the position of
the 2 in a manner that will not change its effect upon the value,
we have

$$(3)\qquad \frac{\Delta y}{\Delta x} = \cos \left(x + \frac{\Delta x}{2}\right) \left(\frac{\sin (\Delta x/2)}{\Delta x/2}\right)$$

We shall now let Δx approach zero as its limit. As Δx
approaches zero, the first factor on the right of the equation
approaches cos x, for the Δx/2 approaches zero and drops out.
The second factor in parentheses expresses the relation of the
sine of an angle to the angle itself. But, if the reader will
visualize the relation of an angle to the sine of the angle, he will
see that, as the angle becomes smaller, its sine becomes smaller.
It can be proved that the ratio of an angle (measured in radians)
to its sine approaches 1 as a limit as the angle approaches zero
as a limit. As the limit is reached, therefore, the whole of the
quantity in the second parenthesis would become 1 and we
would have

$$\frac{dy}{dx} = \cos x \cdot 1 = \cos x \qquad \text{(Derivative of a sine of an angle)} \quad (13)$$

Thus the derivative of the sine of an angle is the cosine of that
angle. Differentiation of the other trigonometric functions
proceeds in a similar spirit.
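
The approach of the two factors to their limits can be watched directly. The sketch below (the angle of 0.7 radian and the function names are hypothetical) prints the ratio of the sine of the half increment to the half increment itself, and the factored form of Δy/Δx, for smaller and smaller values of Δx.

```python
import math


def sine_ratio(dx):
    """Ratio of sin(dx/2) to dx/2, which approaches 1 as dx shrinks."""
    half = dx / 2
    return math.sin(half) / half


def factored_quotient(x, dx):
    """cos(x + dx/2) times the sine ratio, i.e. step (3) of the development."""
    return math.cos(x + dx / 2) * sine_ratio(dx)


def difference_quotient(x, dx):
    """The plain ratio of the increment in sin x to the increment in x."""
    return (math.sin(x + dx) - math.sin(x)) / dx


x = 0.7  # hypothetical angle in radians
for dx in (0.1, 0.01, 0.0001):
    print(dx, sine_ratio(dx), factored_quotient(x, dx), math.cos(x))
```
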


PARTIAL DIFFERENTIATION 


It frequently happens that a function contains two or more
variables that are independent of one another; y = f(x, z, w),
so that the total behavior of y is dependent upon the aggregated
effects of all the three factors upon which it depends. Since
these factors are independent of one another, we may find the
differential relation of y to each term in succession by
differentiating with respect to it, while treating the others as constants.
Thus in the case of y = f(x, z, w), we may differentiate first
with respect to x with z and w regarded as constants, then with
respect to z holding x and w constant, and finally with respect
to w holding x and z constant. The total derivative would
then be the sum of these partial ones. This process is called
partial differentiation. Its several processes are identical in
procedure with those of simple differentiation. However, we
employ a different symbolism. Several different symbols are
used, and out of them we shall choose those of the type D_x f,
the x standing for the variable with respect to which we are
differentiating. Let us take an example.

$$y = x^2 + 3z^2 - 3x + z - 6$$

Differentiating first with respect to x while holding z constant,

$$D_x f = 2x - 3$$

Next differentiating with respect to z while holding x constant,

$$D_z f = 6z + 1$$


We shall apply this process of partial differentiation to a
practical problem. A farmer wishes to make a zinc-lined tank
to hold 62.5 cu. ft. of water. In what dimensions shall he make
it so the amount of zinc required shall be as little as possible?
That is, with what dimensions will the sum of the areas of the
bottom and the sides be a minimum?

Let x equal the length of the tank and y its width. Then
the area of the bottom will be xy. The volume is the area of
the bottom multiplied by the depth, since we are taking the
tank to be a rectangular parallelepiped. That is, if d is the
depth,

$$62.5 = dxy, \qquad \text{or} \qquad d = \frac{62.5}{xy}$$

The total surface (S) is the sum of the bottom surface plus
that of the two sides plus that of the two ends. Therefore,

$$S = xy + 2x \cdot \frac{62.5}{xy} + 2y \cdot \frac{62.5}{xy} = xy + \frac{125}{y} + \frac{125}{x}$$

For convenience in differentiating this may be written,

$$S = xy + 125y^{-1} + 125x^{-1}$$

Now the surface is a function of both the length and the width,
and these two are independent of each other. We shall,
therefore, resort to partial differentiation, first holding x constant
while we differentiate with respect to y, then holding y constant
while we differentiate with respect to x.

$$D_y f = x + (-1)(125y^{-2}) = x - \frac{125}{y^2}$$

$$D_x f = y + (-1)(125x^{-2}) = y - \frac{125}{x^2}$$

The S is to be a minimum by reason of the effect both of the
length and of the width. Therefore each of the two partial
derivatives must be equal to zero. Making them so and solving
the equations we get x - 125/y² = 0. Clearing of fractions,
and transposing, xy² = 125. Similarly y - 125/x² = 0, so
that x²y = 125. Since each is equal to 125, x²y = xy².
Dividing through by xy, we have x = y. Substituting in x²y = 125,
we have x³ = 125, x = 5. Similarly y³ = 125, so that y = 5.
Thus the surface of the tank is at a minimum for the volume
in question when the tank is 5 ft. long, 5 ft. wide, and 2.5 ft. deep.
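
The farmer's tank can be checked numerically. The sketch below (function names are hypothetical) evaluates the surface and its two partial derivatives for a tank of 62.5 cu. ft., and then tries a few tanks of slightly different length and width to see that each requires more zinc.

```python
VOLUME = 62.5  # cubic feet of water the tank must hold


def surface(x, y):
    """Bottom plus two sides plus two ends, with the depth fixed by the volume."""
    return x * y + 2 * VOLUME / y + 2 * VOLUME / x


def partial_x(x, y):
    """Derivative of the surface with respect to x, holding y constant."""
    return y - 2 * VOLUME / x**2


def partial_y(x, y):
    """Derivative of the surface with respect to y, holding x constant."""
    return x - 2 * VOLUME / y**2


def depth(x, y):
    return VOLUME / (x * y)


print(partial_x(5, 5), partial_y(5, 5), surface(5, 5), depth(5, 5))
for x, y in ((5.1, 5.0), (5.0, 4.9), (4.9, 5.1)):
    print(x, y, surface(x, y))
```

Both partial derivatives vanish at a length and width of 5 ft., and every neighboring tank tried has a larger surface.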


INTEGRATION 
So far we have dealt with the process of differentiation, which
involves determining the relation of infinitesimal increments of
one variable to infinitesimal increments of another. The second
part of calculus, as customarily treated, deals with integration.
This is the reverse of differentiation; it involves having in
hand a derivative and wishing to get back from it to the original
function. At first sight it would seem that this should be easy;
we would need only to retrace the steps that would have given
us our quantity in hand as a derivative. That is precisely true.
All we need to do in the process of integration is to recall what
type of function gives us, when we differentiate it, a derivative
of the type exemplified by the one we have in hand, then put
down the original function as our needed integral. Only, it
sometimes requires great ingenuity to recognize our quantity
as a type of derivative with which we have dealt. Great
ingenuity is exercised by mathematicians to put quantities, by algebraic
manipulation, into forms that are familiar as derivatives.

The symbol of integration is ∫. You customarily see it in
such form as this: ∫f(x)dx. This ∫ is really only an old form
of the letter s, and the indicated integration may be thought of
as summing together the infinitesimal increments represented by
dx the number of times indicated by the remainder of the
expression, in this case f(x) times, whatever that f(x) may stand
for. Let us take first our simplest cases. If we differentiate
y = x⁴, we get dy/dx = 4x³. If, therefore, we are given the
expression ∫4x³dx and are told to integrate it, we, understanding
that that command involves the order to get back the function
which if differentiated would give it, might guess that

$$\int 4x^3\,dx = x^4$$

We could have obtained this by raising the exponent of the
x³ by 1, making a fraction out of 1 over this increased exponent,
and multiplying the coefficient of the quantity to be integrated
by this fraction. Thus

$$\int 4x^3\,dx = \frac{1}{3 + 1} \cdot 4x^{(3+1)} = x^4$$

That, you see, is exactly the reverse of what we do when we
differentiate an expression of this type. For when we
differentiate, we diminish the exponent by 1 instead of increasing it,
and we multiply by the original exponent instead of dividing by
it. So integration is the reverse of differentiation. Let us take
the more general case, and, treating it for the present just as we
did above, see what is involved in going from the function to the
derivative and then back again to the function.

$$y = ax^n; \qquad \frac{dy}{dx} = anx^{n-1}$$

$$\int dy = \int (anx^{n-1})\,dx = \frac{1}{(n-1)+1} \cdot an\,x^{(n-1)+1} = ax^n$$


Our rule, then, for integrating a function of the form ax^n
seems to be to increase the exponent of the x by 1, raising it to
(n + 1), and to multiply the coefficient of the x by 1/(n + 1).

But let us remind ourselves of the behavior of an independent
constant. Differentiate y = ax^n + b: dy/dx = anx^(n-1).
Integrating this by the above rule we get

$$\int dy = \int (anx^{n-1})\,dx = \frac{1}{(n-1)+1} \cdot an\,x^{(n-1)+1} = ax^n$$

The b which belonged to the original function has been lost.
That will not do. As a matter of fact, when we integrate, we
can never know whether or not there should be in our integral
an independent constant. So we take no chances; we add a
constant, calling it C. If, then, the C turns out to be of zero
value, its inclusion has at least made us safe. This C is called
the constant of integration and should always be added when
integrating. So our full integral of the above function would
be

$$\int dy = \int (anx^{n-1})\,dx = ax^n + C$$

Recall, now, how we differentiated y = ax^n + bx^p + cx^q + d:

$$\frac{dy}{dx} = nax^{n-1} + pbx^{p-1} + qcx^{q-1}$$

Going back from this derivative to the original function we
would have

$$\int dy = \frac{na}{n}x^n + \frac{pb}{p}x^p + \frac{qc}{q}x^q + C$$

But each of these parts is precisely the integral of the
corresponding part of the function being integrated, so that we have

$$\int dy = \int nax^{n-1}\,dx + \int pbx^{p-1}\,dx + \int qcx^{q-1}\,dx + C$$

In other words, the integral of the sum of any number of
functions of x is the sum of the integrals of those functions.
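
The rule for integrating a term of the form ax^n, and its application to a sum of such terms, may be sketched as a small program. Each term is represented here as a (coefficient, exponent) pair; this representation and the function names are hypothetical conveniences. The guard against the exponent -1 is our own addition, since the rule would there call for division by zero. The constant of integration C is not represented and must still be added by the reader.

```python
def integrate_term(coefficient, exponent):
    """Raise the exponent by 1 and multiply the coefficient by 1/(n + 1)."""
    if exponent == -1:
        raise ValueError("the power rule does not cover the exponent -1")
    new_exponent = exponent + 1
    return coefficient / new_exponent, new_exponent


def differentiate_term(coefficient, exponent):
    """Multiply by the exponent and lower the exponent by 1."""
    return coefficient * exponent, exponent - 1


def integrate_sum(terms):
    """Integrate a sum term by term; the constant C is to be added separately."""
    return [integrate_term(c, n) for c, n in terms]


print(integrate_term(4, 3))                      # the integral of 4x^3
print(integrate_sum([(3, 2), (-3, 0)]))          # the integral of 3x^2 - 3
print(differentiate_term(*integrate_term(9, 2))) # back to the term we started with
```

Differentiating an integrated term returns the original term, which is the sense in which integration reverses differentiation.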
We shall have now a few simple exercises in integration, so
far as we have yet carried our principles. We shall complete
some of them and leave others for the reader to complete. Note
that a dx accompanies each, which indicates the variable with
respect to which we are to integrate.

$$\int 9x^2\,dx = 3x^3 + C$$

$$\int \tfrac{1}{2}x\,dx = \tfrac{1}{4}x^2 + C$$

$$\int (3x^2 - 3)\,dx = x^3 - 3x + C$$

$$\int (8x^4 + 6x^3 - 9x^2)\,dx = \tfrac{8}{5}x^5 + \tfrac{3}{2}x^4 - 3x^3 + C$$

$$\int 8x^3\,dx =$$

$$\int (6x^5 - 4x)\,dx =$$

$$\int (7x^2 + 4x + 3x^{-2})\,dx =$$


STANDARD INTEGRAL FORMS

It is unnecessary for us to take up for detailed discussion 
each of the types of functions as we did under differentiation. 
It will be enough to place in a list, below, a few derivatives on 
the left and their integrals on the right. The reader will recog- 
nize that, if the differentiation of a certain type of function 
yields a certain type of derivative, then, by reason of the meaning 
of an integral, the integration of the type represented by the 
derivative will yield the integral function. Mathematical 
workers depend heavily upon such lists of standard integrals, 
referring to them to find the type involved in their problem and 
from this writing out the integral. If nothing has ever been 
differentiated that yields a particular type of derivative, then 
it is impossible to integrate that type of function—of which, 
however, there are very few in applied mathematics. Full 
texts in calculus, as well as some books of mathematical tables, 
give extensive lists of standard integral forms, while we give 
below only a very few. 


$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$$
Uy = a 
feu- S540 
$$\int e^x\,dx = e^x + C$$

$$\int \log x\,dx = x\log x - x + C$$


ie -ivere+e 


$$\int \sin x\,dx = -\cos x + C$$

$$\int \cos x\,dx = \sin x + C$$

$$\int \tan x\,dx = -\log \cos x + C$$

$$\int \cot x\,dx = \log \sin x + C$$


$$\int \frac{dx}{\sin x \cos x} = \log \tan x + C$$


d: 1 
fni eetw+e 


APPLICATIONS OF INTEGRATION 


Area under a Curve.—There are a number of applications of
the integral calculus to two of which we shall give particular
attention at this time. The first of these is finding the area
under a smooth curve of which the equation is known.

[Figure: the curve with the fixed ordinate DC, the variable
ordinate MP, the neighboring ordinate NQ, and the rectangles
MNRP and MNQS erected on MN.]

Let u be the area bounded by the curve of which the
equation is $y = ax^n$, by the x axis, the fixed ordinate DC,
and the variable ordinate MP.
Evidently as the distance CM varies the area CMPD will vary.
That is, as x takes on an increment, the area u will take on an
increment; so that u is a function of x. Inspection of the diagram
will show that


Area MNRP < area MNQP < area MNQS 
But area MNRP is equal to MN times MP, and area MNQS is
equal to MN times NQ. The MN is a variable distance to
be added to CM which we may call $\Delta x$, and the area MNQP
is a variable area to be added to u which we may call $\Delta u$.
Making these substitutions we have

$$MP \cdot \Delta x < \Delta u < NQ \cdot \Delta x$$

Dividing through by $\Delta x$ we have

$$MP < \frac{\Delta u}{\Delta x} < NQ$$

If now we let $\Delta x$ approach zero as a limit, $\Delta u/\Delta x$ will become
$du/dx$, NQ will approach MP as a limit, and this limit will be y,
the vertical ordinate of the curve at the point under consideration.
Thus in the limit, $du/dx = y$.

But $\int du = u$. Therefore $\int y\,dx = \int ax^n\,dx = u$.

This shows that we can get areas under a curve by integrating 
the equation of the curve. 

But, since x has successively different values, we must always
find the integral up to a certain value of x. When we substitute
a value for the x, we have the area under the curve from the
origin (zero) up to that point. If we desire to find the area 
between limits neither of which is zero, we shall need to find the 
area up to the higher limit, then to the lower limit, and to take 
as our required area the former minus the latter. This we do 
whether the two points between which we desire to integrate lie 
on the same side of zero or on opposite sides; the process of 
algebraic subtraction will take proper care of signs. 

Let us use again our curve on page 13 and suppose we can
have no negative y values. The limits of our curve will then
be x = 0 and x = 8, and we want to find the total area under
the curve between those two limits. Our formula for the curve
was $y = 8x - x^2$. The integral of this is

$$\int y\,dx = \int (8x - x^2)\,dx = 4x^2 - \tfrac{1}{3}x^3 + C$$

Substituting for x its upper limit value x = 8, we get

$$4x^2 - \tfrac{1}{3}x^3 + C = 4 \cdot 64 - \tfrac{1}{3} \cdot 512 + C = 256 - 170\tfrac{2}{3} + C = 85\tfrac{1}{3} + C$$

We must now substitute for the lower limit, which is x = 0.
But, when we substitute zero for x, we get for our integral
merely C. Subtracting the lower value from the upper one, the
C's cancel out so that our whole area is $85\tfrac{1}{3}$.

Suppose we wish to find the area of this curve up only to
where x = 3. We substitute 3 for x in the integral and get

$$4 \cdot 3^2 - \tfrac{1}{3} \cdot 3^3 + C = 36 - 9 + C = 27 + C$$

When we substitute zero for x, we get merely C in our integral.
Taking the difference between the values at these two limits, we
have (27 + C) - C = 27.

Suppose we wish to find the area of the curve between the 
points where x = 5 and where x = 6. 


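$$\begin{aligned}
\int_5^6 y\,dx &= \int_0^6 (8x - x^2)\,dx - \int_0^5 (8x - x^2)\,dx \\
&= (4 \cdot 6^2 - \tfrac{1}{3} \cdot 6^3 + C) - (4 \cdot 5^2 - \tfrac{1}{3} \cdot 5^3 + C) \\
&= (72 + C) - (58\tfrac{1}{3} + C) = 13\tfrac{2}{3}
\end{aligned}$$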

The C disappears in the process of subtraction. The C always 
disappears when integrating between limits because always there 
is involved subtraction with the C appearing in both minuend 
and subtrahend. 
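
The three areas just found can be checked with a short Python sketch. It is offered only as an illustration: the function names `antiderivative`, `area`, and `midpoint_sum` are inventions for this example. The first two follow the method of the text, with exact fractions and the C left out because it cancels. The third sums thin rectangles under the curve, as a rough independent check.

```python
from fractions import Fraction

def antiderivative(x):
    """Integral of 8x - x^2 with the constant C omitted: 4x^2 - x^3/3."""
    x = Fraction(x)
    return 4 * x**2 - x**3 / 3

def area(lower, upper):
    """Area under y = 8x - x^2 between two limits; C cancels in the subtraction."""
    return antiderivative(upper) - antiderivative(lower)

def midpoint_sum(lower, upper, strips=10_000):
    """Approximate the same area by adding up thin rectangles (midpoint rule)."""
    width = (upper - lower) / strips
    total = 0.0
    for i in range(strips):
        x = lower + (i + 0.5) * width   # middle of the i-th strip
        total += (8 * x - x**2) * width
    return total

print(area(0, 8))          # exact area from 0 to 8, as a fraction
print(area(0, 3))          # exact area from 0 to 3
print(area(5, 6))          # exact area from 5 to 6
print(midpoint_sum(0, 8))  # numerical approximation for comparison
```

The rectangle sum approaches the exact fraction as the strips are made narrower, which is the sense in which integration sums elements that approach zero in size.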

The Equation of a Curve.—The second application of integra- 
tion is in finding the equation for a curve when the slope of the 
curve is known. Thus, in developing the formula for the curve 
of a normal distribution we first obtain an expression for the slope 
at any point, x. How shall we get from this information the 
equation for the curve itself? If we had the equation of the 
curve, we know that we would need to differentiate it in order to 
get an expression for its slope. Obviously, therefore, if we have 
the expression for its slope, we need to employ the converse 
operation of integration in order to get the equation for the curve. 

There are other types of application of integration, such as 
finding the length of a curve or the area between curves, and 
many of the type that involves summing elements that approach 
zero in size but of which the number bears a reciprocal relation 
to the size. Such types as finding the length of a curve or the 
area between curves would deserve elaboration here except for 
the fact that for our present purpose we do not need to draw 
upon them. For the type that involves summing infinitesimals 
into a whole, we shall have considerable use, but the application
follows so obviously from the basic meaning of integration as 
not to require discussion. We take occasion at a point of 
application in our chapter on Measurement of Variability to 
develop the important Taylor series, and consequently refrain 
from developing it here. 


INTEGRATION BY PARTS 


A device to which mathematical workers often resort when 
direct integration is difficult is integration by parts. You 
remember that, when we differentiated a product, we got the 
following: 

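$$\frac{d(uv)}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$$

By transposition we may write this

$$u\frac{dv}{dx} = \frac{d(uv)}{dx} - v\frac{du}{dx}$$

When integrated this becomes

$$\int u\,dv = \int d(uv) - \int v\,du$$

But $\int d(uv)$ is equal to $uv$. Therefore

$$\int u\,dv = uv - \int v\,du \tag{14}$$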


In order to show how we may employ this combination of
parts where we cannot integrate directly, let us take the
following example: $\int dy = \int xe^x\,dx$. We may take $u = x$ and
$dv = e^x\,dx$. Then $du = dx$ and $\int dv = v = \int e^x\,dx$. But we know
the integral of $e^x\,dx$; it is simply $e^x$. Substituting all of these
values in our Eq. (14) above,

$$\int xe^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C = (x - 1)e^x + C$$


We may sometimes integrate by parts several times in succession 
or may employ the formula resulting from the product of three 
or more factors instead of two as illustrated in this example. 
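
A result obtained by parts can be tested numerically. The sketch below, which is an illustration only and uses invented function names, compares the integral of x times e to the power x, as given by parts, with a direct sum of thin rectangles between the limits 0 and 2. The limits are arbitrary choices for the example.

```python
import math

def by_parts_antiderivative(x):
    """Result of integrating x*e^x by parts: (x - 1)e^x, constant omitted."""
    return (x - 1) * math.exp(x)

def midpoint_integral(f, lower, upper, strips=100_000):
    """Approximate a definite integral by the midpoint rule."""
    width = (upper - lower) / strips
    return sum(f(lower + (i + 0.5) * width) for i in range(strips)) * width

# Definite integral from 0 to 2 by each route; the two should nearly agree.
exact = by_parts_antiderivative(2.0) - by_parts_antiderivative(0.0)
approx = midpoint_integral(lambda x: x * math.exp(x), 0.0, 2.0)
print(exact, approx)
```

Agreement of the two figures to many decimal places is good evidence that no sign was lost in applying Eq. (14).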


SUCCESSIVE INTEGRATION 


Just as it is possible to differentiate a number of times in
succession (“partial differentiation”), so it is possible to integrate
any number of times in succession, either with respect to the
same variable or with respect to different variables. In the case
of those functions which are commonly encountered in practice,
we can integrate in any order we please where we are integrating 
each time for a different variable, just as was the case with 
differentiation. The expression at which we have arrived at 
the climax of any process of integration constitutes the point 
of departure for the next integration. No new principles are 
involved, although usually expressions become more complicated 
with added steps in successive integration. 


CHAPTER II 
MEASUREMENT OF CENTRAL TENDENCIES 


PREVIEW OF STATISTICS 


The General Nature of Statistics—The student should realize 
from the beginning that there is nothing magical or occult or 
especially difficult about statistics. The task of the statistical 
worker is merely to describe succinctly a set of measurements 
or “variables,” or the relations between sets of variables. As 
long as we have only small numbers of cases with which to deal, 
we can get along very well by describing them one by one, or 
our comparisons pair by pair. We may say about a group 
of three boys, for example, that John weighs 112 lb., Sam 123, 
and Charles 135. We may say, further, that Charles weighs 
more than Sam in spite of the fact that the former is older. 
But if we have 1,000 boys to describe, or 100, or even 20, we 
cannot talk about them thus one by one; to do so would require 
too much time. We are obliged, therefore, to adopt some
more compact method of description that will tell the truth
succinctly, yet do justice to the group. So we describe the weight
in terms of an average and an expression for variability, and the
closeness of relation between weight and age in terms of a
coefficient of correlation.

The Tasks of the Statistician.—In giving an adequate descrip- 
tion of a mass of quantitative data, we shall need to do one or 
another, or several, of the following things: 

1. Mention some representative number to indicate the general 
size of the variables—a mean, a median, a mode, or other index 
of “central tendency.” The popular term for this is “average.” 

2. Indicate how widely the variables are spread—how much 
they differ from one another. As measures of such variability 
we have average deviation, range, percentiles, etc. 

3. Show the shape of the distribution. The frequency
polygon resulting from the distribution of variables may be
rectangular, or bell-shaped, or skew. If bell-shaped, it may be
highly peaked up in the middle (leptokurtic), or rather flat
(platykurtic), or moderately peaked (mesokurtic). Measure- 
ments taken in connection with time trends, or summated 
measurements, may fit parabolas, or sine curves, or other types 
of regular or irregular trend curves, and we may wish to measure 
the goodness with which these curves fit the data. 

4. Show the relation of two or more sets of variables to each 
other. Where we wish to show the relation of the sets to each 
other as wholes, we may indicate the percentage of overlapping 
or the difference between the means or the comparative variabili- 
ties. Where we wish to show the degree of parallelism between 
the corresponding measurements in different distributions, we 
may resort to coefficients of correlation. 

5. Indicate how dependable our generalizations are (our
means, standard deviations, coefficients of correlation) by 
showing how much they must be expected to change with further 
sampling. This is the problem of reliability. 

6. Translate the variables with which we are working into 
forms that have a standard meaning, just as people long ago 
came to translate measures of distance or of weight into a few 
standard forms such as foot, meter, pound, or gram. 

The whole of applied statistics is comprehended under the 
above six types of functions. The student will do well to 
keep the details of his work in statistics in this perspective. In 
this chapter we shall discuss the first of these tasks—measuring 
central tendencies. This has to do with giving a picture of the 
general size of the scores (the variables). There are several
measures of central tendency which we shall take up in turn:
arithmetic mean, median, mode, geometric mean, and harmonic
mean.


THE ARITHMETIC MEAN 
Definition and Formula.—The mean is that point in a
distribution of scores around which the moments¹ are equal. It is well
pictured by a seesaw. The fulcrum of a seesaw must be so
placed that the moments on one side exactly balance those on
the other. The mean is popularly called the average, although
in technical statistics the term average is employed for any
measure of central tendency.

¹ Here we are employing the term moments in the sense in which it is used
in physics in connection with rotary momentum. The term is also used
in a different and more technical sense in statistics to designate the power
to which deviations are raised before averaging them.

The reader has doubtless long thought of the mean as the sum 
of the scores divided by the number of scores. That is correct, 
but its truth follows as a corollary from the more general concept 
of equalized moments stated above. We shall first concretely 
illustrate this correspondence and then give a generalized proof 
for it. Consider the series of numbers 17, 13, 8, 7, 4, 4, 3, 2, 1, 1.
The sum of the scores is 60, the number of scores is 10, and the
mean 6. Four of the scores are above the mean and six of them
below. The moments above are given by the deviations of the
four high scores from the mean and are

$$(17 - 6) + (13 - 6) + (8 - 6) + (7 - 6) = 11 + 7 + 2 + 1 = 21.$$

The moments below are

$$(6 - 4) + (6 - 4) + (6 - 3) + (6 - 2) + (6 - 1) + (6 - 1) = 2 + 2 + 3 + 4 + 5 + 5 = 21.$$

Thus the sum of the moments above the mean equals the sum 
of those below the mean. 

Now for the generalized proof. Let the scores above the mean
be represented by $a, b, c, d, \ldots, k$ and those below by
$p, q, r, \ldots, z$; let the mean be $M$, the number of scores above the
mean $s$ and the number of scores below the mean $t$, the whole
number of scores, $s + t$, being $N$. Then, since the moments
around $M$ are to be equal,

$$(a - M) + (b - M) + (c - M) + \cdots + (k - M) = (M - p) + (M - q) + \cdots + (M - z)$$

But the $M$ occurs in the scores above the mean $s$ times and in the
scores below the mean $t$ times. We may separate out the
recurrent $M$'s and have the following equation:

$$(a + b + c + \cdots + k) - sM = tM - (p + q + r + \cdots + z)$$

Transposing, and multiplying both sides of the resultant
equation by $-1$,

$$(s + t)M = (a + b + c + \cdots + k) + (p + q + r + \cdots + z)$$

$$M = \frac{(a + b + c + \cdots + k) + (p + q + r + \cdots + z)}{s + t}$$

But the numerator of the fraction on the right side of the equation
is the sum of all the scores, while the denominator is the whole 
number of scores, N. Therefore M, the mean point about which 
the moments above and below are equal, has also as its value 
the sum of the scores divided by the number of scores. We 
may, therefore, regard as equivalent definitions of the mean 
(1) the sum of the scores divided by the number of scores and 
(2) the point in the distribution around which the moments are 
equal. The reader will soon see that the latter is the more 
illuminating definition. 
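
The equivalence of the two definitions can be tried on the series used above. The following Python sketch is illustrative only; the helper name `moments_about` is an invention for the example. It computes the mean by definition 1 and then confirms that the moments balance about that point but not about a neighboring one.

```python
scores = [17, 13, 8, 7, 4, 4, 3, 2, 1, 1]

def moments_about(point, values):
    """Return (sum of deviations above the point, sum of deviations below it)."""
    above = sum(v - point for v in values if v > point)
    below = sum(point - v for v in values if v < point)
    return above, below

# Definition 1: sum of the scores divided by the number of scores.
mean = sum(scores) / len(scores)

# Definition 2: the moments above and below this point should be equal.
print(mean, moments_about(mean, scores))

# About any other point, say 5, the two sides do not balance.
print(moments_about(5, scores))
```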

The Mean of Grouped Scores.—We may hold on to definition 1
a little longer so as to apply it to grouped scores. In the
distribution of Table I several of the scores are of the same size.
The frequencies of these similar scores are shown by tallies in
column 2 and by Arabic figures in column 3. We could, of
course, find the mean of the distribution by adding, one by one,
all the 103 scores, regardless of the fact that there are duplicates,
and dividing by the number of scores. But we have available
multiplication as a foreshortened form of addition; it is far more
economical to multiply score 9 by 23, for example, and add the
product to the other moments than it is to add in the nine
separately 23 times. We, therefore, multiply each score by its
frequency, as shown in column 4, then add these products.
The sum of these moments will obviously be the same as that of
the scores added separately, and this sum divided by N will
give the mean. Thus the formula will be¹

$$M = \frac{\Sigma fX}{N} \qquad \text{(Arithmetic mean)} \tag{15}$$


TABLE I.—SCORES IN HANDWRITING ON THE THORNDIKE SCALE

    (1)            (3)            (4)
 X (score)    f (frequency)        fX
    14              2              28
    13              4              52
    12              6              72
    11             10             110
    10             20             200
     9             23             207
     8             19             152
     7             13              91
     6              5              30
     5              1               5
  Totals          103             947

[Column 2, the tally marks, is not reproduced.]
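
Formula (15) can be tried on the figures of Table I. The Python sketch below is an illustration only. The score values 14 down to 5 are an assumption here: they are recovered by dividing each fX entry by its frequency, since the score column of the table is incomplete. The variable names are likewise inventions for the example.

```python
# Table I as score -> frequency. The scores are inferred from fX / f
# (an assumption of this sketch); the frequencies are those of the table.
table_i = {14: 2, 13: 4, 12: 6, 11: 10, 10: 20, 9: 23, 8: 19, 7: 13, 6: 5, 5: 1}

# Short way: multiply each score by its frequency, add, divide by N.
n = sum(table_i.values())
sum_fx = sum(x * f for x, f in table_i.items())
grouped_mean = sum_fx / n

# Long way: write every score out separately and add them one by one.
expanded = [x for x, f in table_i.items() for _ in range(f)]
long_mean = sum(expanded) / len(expanded)

print(n, sum_fx, grouped_mean, long_mean)
```

The two means agree, since multiplying by the frequency is only a shortened form of repeated addition.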


Mean of Scores Grouped by Intervals.—If our scores are many 
and are widely spread, we cannot conveniently group them by 
individual scores; we find it more convenient to group them by 
intervals with a range of more than one unit. In Table II the 
scores are grouped in intervals with a range of 5 and with fre- 
quencies shown by tallies and then by Arabic numbers. Interval 
149.5-154.4, for example, contains all the scores that have values 
between 149.5 and 154.499 . . . , just short of 154.5 but not 
including 154.5. All these scores may be thought of as centering 
around the mid-point of the interval in which they fall, which is 
152. Similarly the scores in each of the other intervals may 
be thought of as centering around the mid-points of the intervals 
as shown in column 4 of the table. We may, therefore, get the 
average of the distribution by multiplying each of these mid- 
values by the frequency of scores in the corresponding intervals 
and by dividing by N. Intervals may be of any convenient 
length, but we usually like to make them of such length as to 
give from 12 to 18 intervals for a distribution. A smaller 
number will, however, do little harm when central tendencies 
are being calculated. A favorite length of interval is five or 
ten score points if this gives a number of intervals anywhere 
near what is desired. The interval ought normally to begin 
with a multiple of its unit of length. Thus, if the interval is 
three score points in length the initial number of each interval 
should be a multiple of three. The way in which to find the 
mid-point is to add the initial numbers designating the two 
successive intervals and divide the sum by two. 


¹ The symbol $\Sigma$ means that we are to sum the variable following it ($fX$).
$\Sigma$ is the Greek capital sigma. Some writers in statistics, especially those who
follow recent practice in England, use $S$ instead of $\Sigma$ as the summation sign.
TABLE II.—EDUCATIONAL AGES OF 109 PUPILS EXPRESSED IN MONTHS

      (1)              (3)             (4)            (5)
  Educational     f (frequency)   X (mid-point)        fX
      age
  179.5-184.4           1             182             182
  174.5-179.4           8             177           1,416
  169.5-174.4           6             172           1,032
  164.5-169.4           5             167             835
  159.5-164.4           4             162             648
  154.5-159.4          13             157           2,041
  149.5-154.4          16             152           2,432
  144.5-149.4          13             147           1,911
  139.5-144.4          22             142           3,124
  134.5-139.4          12             137           1,644
  129.5-134.4           6             132             792
  124.5-129.4           3             127             381
  Totals              109                          16,438

[Column 2, the tally marks, is not reproduced.]

$$M = \frac{16{,}438}{109} = 150.8$$
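
The computation of Table II may be sketched in Python as follows. This is an illustration only, with invented variable names. Each mid-point is found by adding half the interval width to the lower limit, which gives the same value as halving the sum of the initial numbers of two successive intervals.

```python
WIDTH = 5  # range of each interval, in months

# Lower limits and frequencies of Table II, highest interval first.
lower_limits = [179.5, 174.5, 169.5, 164.5, 159.5, 154.5,
                149.5, 144.5, 139.5, 134.5, 129.5, 124.5]
frequencies = [1, 8, 6, 5, 4, 13, 16, 13, 22, 12, 6, 3]

# Mid-point of each interval: lower limit plus half the width.
midpoints = [lo + WIDTH / 2 for lo in lower_limits]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
mean = sum_fx / n

print(n, sum_fx, round(mean, 1))
```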


A Mean from a Guessed Mean.—Falling back now on
definition 2, we recall that the moments above a true mean are exactly
equal to those below. We might, therefore, find the true mean
by trying one point after another until we get one that gives
equal moments on both sides. Of course, no one would do so
foolish a thing in practice. Nevertheless, odd as it may sound,
statisticians almost always (unless they are working with a
calculating machine) approach the calculation of a mean by
guessing the mean and then correcting the guess by an
arithmetical adjustment of such sort as to balance the plus and
minus moments. It is ordinarily much the easiest way.
Suppose the true mean of a distribution is, as the calculator is later
to learn, $M$. But not yet knowing this, he guesses the mean at
$M_g$. Let the amount by which his guessed mean differs from the
true mean be represented by $c$. Then, if $x$ is the deviation of a
score in the distribution from the true mean and $x'$ is its deviation
from the guessed mean,¹ $x = x' - c$. Summing the deviations
for all the scores,

$$\Sigma x = \Sigma x' - \Sigma c$$

But $c$ is the same for each of the $N$ scores, and $\Sigma x$ (the sum of
the deviations around the true mean) is, by definition of a mean,
equal to 0. Therefore,

$$\Sigma x' - Nc = 0$$

Transposing and dividing through the equation by $-N$,

$$c = \frac{\Sigma x'}{N}$$

The amount by which the assumed mean missed the true mean
equals the algebraic sum of the deviations from the assumed
mean divided by the whole number of scores. Thus we may
guess a mean at any point we please, compute the deviations
from this point summed and divided by N, and add this quotient
to our assumed mean to get the true mean. Our formula is,
then,

$$M = M_g + c = M_g + \frac{\Sigma x'}{N} \qquad \text{(Mean from a guessed mean)} \tag{16}$$

¹ The term conventionally employed for a deviation from an assumed
mean is the Greek letter $\xi$. But we are avoiding it because the novice finds
it unfamiliar and difficult to write. Besides, the best statistical practice is
to reserve Greek letters for “true” values, and this use does not come in that
class.

Application of the Formula.—This procedure may be applied
to grouped or to ungrouped data. Let us consider first ungrouped
data. Suppose you are finding the mean of grades for your class
as listed in your record book. You may look them over in a
general way and decide upon a suitable one as a guessed mean.
Then begin at the top of the column and add mentally
(algebraically) the deviations of the scores from this assumed mean,
divide the excess by the number of students, and add this
quotient algebraically to the assumed mean. The result will
be the true mean.

When scores are grouped, as in Tables II and III, the principle
is equally applicable. Let us take the more complex case,
Table III. We assume a mean anywhere we please, say at the
mid-point of interval 145-149. We always set the assumed mean
at the mid-point because we want to regard the measures in the
several intervals as centered around the mid-points. We must
now take the deviations from this assumed mean and multiply
them by their corresponding frequencies. Each measure in
interval 150-154 deviates +5 from the assumed mean, each in
155-159, +10, etc. But let us not carry along these big numbers,
5, 10, etc.; let us work in terms of intervals. Everything
centered about the mid-point of interval 145-149 deviates 0 interval
from the assumed mean, everything about the mid-point of
interval 150-154 deviates 1 interval from the assumed mean and
in the other intervals by the number of steps indicated in column 3.


TABLE III.—EDUCATIONAL AGES OF 109 PUPILS EXPRESSED IN MONTHS

       (1)            (2)      (3)      (4)
 Educational age       f        x'      fx'
    180-184            1        7         7
    175-179            8        6        48
    170-174            6        5        30
    165-169            5        4        20
    160-164            4        3        12
    155-159           13        2        26
    150-154           16        1        16
    145-149           13        0         0
    140-144           22       -1       -22
    135-139           12       -2       -24
    130-134            6       -3       -18
    125-129            3       -4       -12

    Totals           109                +83

$$c = \frac{83}{109} = 0.76. \qquad 0.76 \times 5 = 3.8. \qquad M = 147 + 3.8 = 150.8$$


We compute our moments as in the earlier exercises of this
chapter and algebraically sum them according to the formula.
But when we have found our c, it is in intervals, since that is the
unit with which we have been working. An interval is, in our
particular problem, 5 scores wide; hence a c of 0.76 interval equals
a c of 3.8 score points. Add this correction to our assumed mean,
147, and we have 150.8 as the true mean, which is the same as
we got before. Our formula is, thus,

$$M = M_g + \frac{\Sigma fx'}{N}$$

where the quotient, if computed in intervals, is first converted
into score points.


The f in the formula is merely a “symbol of operation”; the 
formula would mean exactly the same if it were not there. 
The f merely indicates that we have foreshortened our additions
by resorting to multiplication by frequencies wherever there were 
several scores of the same value. 

The worker will find the guessed mean method a very con- 
venient method. It is customarily called the short method. 
Not a bit of accuracy is lost by it; the mean will turn out to be 
precisely the same no matter where the guessed mean is taken. 
In fact the method of finding a mean by adding the scores and 
dividing by N may be regarded as a form of guessed-mean 
method, the assumed mean being at zero. 
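
That the result does not depend on where the guess is placed can be shown with the figures of Table III. The Python sketch below is an illustration only; the function names are inventions for the example. It works in intervals, converts the correction into score points, and tries several guessed means, including zero.

```python
WIDTH = 5  # score points per interval

# Mid-points and frequencies of the twelve intervals, highest first.
midpoints = [182, 177, 172, 167, 162, 157, 152, 147, 142, 137, 132, 127]
frequencies = [1, 8, 6, 5, 4, 13, 16, 13, 22, 12, 6, 3]

def correction_in_intervals(guess):
    """c, expressed in intervals: sum of f * x' divided by N."""
    n = sum(frequencies)
    steps = [(x - guess) / WIDTH for x in midpoints]  # deviations x' in intervals
    return sum(f * s for f, s in zip(frequencies, steps)) / n

def mean_from_guess(guess):
    """Guessed mean plus the correction converted into score points."""
    return guess + correction_in_intervals(guess) * WIDTH

# Guess at the mid-point of interval 145-149, as in the text.
print(round(correction_in_intervals(147), 2), round(mean_from_guess(147), 1))

# Any other guess, even zero, leads to the same mean.
print(round(mean_from_guess(162), 1), round(mean_from_guess(0), 1))
```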

The Meaning of a Score; Discrete versus Continuous Series.— 
When we compute means we are confronted with a difficulty 
about the meaning of scores. What does 6 mean? Does it 
mean just 6 or anything from 6 to a trifle short of 7? Or does 
it mean from 5.5 to 6.5? When there are 6 boys in a crowd 
there are just 6 and no fraction. Similarly a gun has fired just 6 
times, a student has finished just 6 problems, a player has hit 
the ball just 6 times. But if a boy is reported to be 6 years old,
that may mean anything from 6 to a trifle short of 7, or it may
mean approximately 6—anywhere between 5½ and just short
of 6½. The same is true of 6 miles, 6 hr., 6 lb.—where the report
is so crude as to mention only whole numbers. Some data must
be measured in terms that necessarily involve only whole num- 
bers; there cannot in the nature of the case be fractions. Such a 
series of measures is said to be discrete. Other data involve no 
real breaks; each degree passes by infinitesimal gradations into 
the next. Such series are said to be continuous. 

Now a discrete number should afford us no difficulty when 
computing a mean, except that the mean itself must be regarded 
as merely symbolic. Each number is exactly what it purports 
on the surface to be—a 6 is 6.000 and nothing else. We may 
add these numbers as they stand, divide the sum by N, and have 
an unequivocal mean. But if the measured series is continuous 
and our reports are crudely put in whole numbers, then our 
numbers must all be taken as stretching through a whole unit. 
Indeed the same thing is true even when we give our measure-
ments in terms that add some decimals; the last decimal covers 
a stretch through the unit of its order. Our confusion is made 
worse by lack of uniformity in indicating the direction in which 
this range spreads from the value named. In some cases the 
score named stands at the mid-point of the range designated by
it—eight years old means nearer eight than any other whole 
number, somewhere between seven and a half and just short of 
eight and a half. At other times the score stands for the value 
just across the lower margin—eight years old means from 
just eight to barely short of nine. With the former meaning, a 
number of eights, such as one would find in a frequency table, 
would tend to average exactly eight; but in the latter meaning, 
the average for the group would tend to be eight and a half. 
Evidently our measures of central tendency of a distribution of 
scores will give us different results under the two interpretations. 

Both Kelley and Holzinger recommend that, unless the 
evidence in the data clearly indicates otherwise, we take all 
numbers in the former sense—as standing at the mid-point of 
a range from a half unit below to a half unit above. Thus 8 
would cover from 7.5 to just short of 8.5 and 163.796 would cover 
163.7955 to just short of 163.7965. That, then, involves taking 
roughly given numbers at their face value. But it affects the 
limits of intervals in a frequency table and makes necessary a 
form of tabulation different from that customarily advised in 
most of the elementary textbooks on statistics. For an 
interval must start where the range covered by a number starts: 
a half unit below the designated number. For purposes of 
tabulation it is safest to indicate the limits of the intervals in a 
way that makes this clear, as illustrated at the left below. How- 
ever, if only whole numbers are involved in the tabulation (or 
whole units in respect to the digit farthest to the right in the 
interval designation), no confusion can occur from the simpler 
tabulation illustrated at the right provided one uses the correct 
mid-point and remembers when computing a median that the 
interval begins a half unit below the value indicated by the initial 
number designating the interval. 


              (1)                               (2)

   Correct Way to Designate             A Simpler Form of
   the Limits of Intervals              Designation Usually
   When Number at Mid-point             Satisfactory
        19.5-24.499                          20-24
        14.5-19.499                          15-19
         9.5-14.499                          10-14
         4.5- 9.499                           5- 9
        -0.5- 4.499                           0- 4

If, however, the context clearly indicates that the number
stands for a range at the lower margin of which it is placed, this 
meaning should be followed in the tabulation, the mid-point 
should be determined accordingly, and the interpolation (which 
we shall shortly find to be involved in the computation of a 
median) should take as the beginning of the interval the value 
of the initial number rather than a half unit below that value. 
This is the method to which the reader has probably already been 
introduced in the more elementary books on statistics. Its 
interval limits may be indicated by either of the two following 
methods: 


              (1)                               (2)

   Correct Way to Designate             A Simpler Form of
   the Limits of Intervals              Designation Usually
   When Number at Lower Margin          Satisfactory
        20.-24.999                           20-24
        15.-19.999                           15-19
        10.-14.999                           10-14
         5.- 9.999                            5- 9
         0.- 4.999                            0- 4


In this latter case the mid-point of the interval can most 
easily be found by taking half of the sum of the initial numbers 
of two successive intervals, and the same technique will hold 
for the arrangement on the left in the former case. But for 
the arrangement on the right in the former case we must obtain 
the mid-point by taking half the sum of the initial and final 
numbers designating the limits of an interval. 

If this latter method (customary in most elementary texts)
is used in finding a mean from a frequency distribution, the mean 
will not tally with that obtained by adding the individual scores 
and dividing by their number unless either the mean obtained 
from adding the ungrouped scores is increased by 0.5 or that 
obtained from the frequency table has been diminished by 0.5. 
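
The two ways of reading a score lead to mid-points that differ by half a unit. A short Python sketch may make this plain. It is an illustration only, and the function names are inventions for the example. The interval written 20-24 is taken first with each number at the mid-point of its range and then with each number at the lower margin.

```python
def midpoint_number_at_midpoint(first, last):
    """Interval written 'first-last' where each score is the centre of its unit.

    The interval really runs from first - 0.5 to last + 0.5, so its middle
    is half the sum of the initial and final numbers.
    """
    return (first + last) / 2

def midpoint_number_at_lower_margin(first, next_first):
    """Interval where each score marks the lower margin of its unit.

    The interval runs from `first` up to the initial number of the next
    interval, so its middle is half the sum of those two initial numbers.
    """
    return (first + next_first) / 2

centre_reading = midpoint_number_at_midpoint(20, 24)
margin_reading = midpoint_number_at_lower_margin(20, 25)
print(centre_reading, margin_reading, margin_reading - centre_reading)
```

Every mid-point is raised by the same half unit under the second reading, and so the mean from the frequency table is raised by that amount as well.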


THE MEDIAN 


Meaning.—The median is the mid-point in a distribution—the 
point above or below which lie an equal number of cases. Note 
that, while for a mean the moments above and below must be 
equal, for a median the number of cases above and below must be 
equal. These two conditions are not identical except in a 
perfectly symmetrical distribution, and sometimes the mean
and the median of a distribution may differ from each other 
considerably. 

Computing a Median.—For the computation of a median the 
scores must be arranged in regular ascending order according to 
size, or at least they must be thought of in such order. Our 
median lies halfway up through the series; i.e., ½N units from the
beginning of the series. Suppose we have the ungrouped scores
2, 4, 5, 5, 7, 8, 8, 9, 9, 9. The number of items is 10, so that we
must use up 5 scores—go to the end of the fifth score—for the
median. The end of the fifth score is the upper limit of score 7, 
the value of which is 7.5. It happens that the next score in the 
series, 8, begins at 7.5, as stated above, so that to locate the 
median at 7.5 puts just half the scores below and half above. 
But if there had been a gap (say the next score had been a 9), 
we would make a rough adjustment by placing the median at the 
mid-point between where score 7 leaves off and where score 9 
begins; that is halfway between 7.5 and 8.5, which is 8.0. If 
the number of items is odd, the median will fall at the middle of 
the range of the digit and will have the value k.0, where k repre- 
sents the mid-score. But medians in small samples are usually 
very rough statistics, and we cannot afford to be very finicky 
about such nice adjustments as have just been mentioned. 
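
The rule for ungrouped scores may be sketched in Python. This is an illustration only: the function name `median_ungrouped` is an invention, and the sketch assumes whole-number scores, each standing at the mid-point of a range one unit wide. The standard library function `statistics.median` is used as a check.

```python
import statistics

def median_ungrouped(scores):
    """Median of whole-number scores, each covering half a unit either side."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: the median falls at the middle of the mid-score's range.
        return float(ordered[n // 2])
    # Even N: upper limit of the lower middle score and lower limit of the
    # upper middle score; if there is a gap, take the point halfway across it.
    end_of_lower = ordered[n // 2 - 1] + 0.5
    start_of_upper = ordered[n // 2] - 0.5
    return (end_of_lower + start_of_upper) / 2

series = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
with_gap = [2, 4, 5, 5, 7, 9, 9, 9, 9, 9]   # the sixth score is now a 9

print(median_ungrouped(series), statistics.median(series))
print(median_ungrouped(with_gap), statistics.median(with_gap))
```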

We may illustrate the computation of a median from grouped
data by Table III. We accumulate our frequencies up through
as many intervals as we can without exceeding ½N. When we have
reached the bottom of an interval that, if included, would more
than exhaust ½N, we interpolate within that interval to locate
our point; i.e., we place it such a proportional part of the distance
through the interval as the cases yet remaining from ½N bear
to the whole number of cases in the interval.

In Table III, the median lies N/2 = 109/2 = 54.5 scores from the beginning.


3 + 6 + 12 + 22 = 43 scores, 


which carries us to the beginning of the interval 145-149, with 54.5 -
43 = 11.5 scores yet to go. Since the total frequency in
this interval is 13, we must go 11.5/13 of the distance through the 
interval to find the median. As we have interpreted the mean- 
ing of scores here, the lower margin of the interval lies at 144.5. 
Therefore, 

\[ \text{Mdn.} = 144.5 + \left(\frac{11.5}{13} \times 5\right) = 144.5 + 4.4 = 148.9 \]
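
The interpolation can be sketched in Python as follows. Only the first five frequencies (3, 6, 12, 22, 13) and the total of 109 cases come from the computation above; the remaining 53 cases are lumped into one placeholder entry, which is harmless here because their arrangement above the median does not affect the result. The function name and its arguments are illustrative assumptions.

```python
def grouped_median(first_lower_limit, width, freqs):
    """Median of a frequency table by interpolation within one interval.

    Assumes the cases in the median interval are spread evenly through it.
    """
    half = sum(freqs) / 2
    cumulative = 0
    for index, f in enumerate(freqs):
        if f > 0 and cumulative + f >= half:
            lower = first_lower_limit + index * width
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("the distribution has no cases")

# Intervals of width 5 starting at 124.5; the last entry is a placeholder
# for the 53 cases lying above the interval 145-149.
freqs = [3, 6, 12, 22, 13, 53]
result = grouped_median(124.5, 5, freqs)
print(round(result, 1))   # 148.9
```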


Mid-score versus Median.—If one is seeking the middle score 
of a series, the formula for finding it is 


\[ \text{Mid-score} = \frac{N + 1}{2} \]


Thus, if there are 13 scores in a series, the middle score is the 
seventh one, which would be found from the expression 


\[ \frac{13 + 1}{2} = 7 \]


At first sight it would seem as if there are 6 scores above the 
seventh score and 6 below and that the seventh should, therefore, 
be the median. But there are six scores below where the seventh 
score begins and 6 above where the seventh ends. The mid-score 
is, thus, a saddleback that stretches through an appreciable 
interval. The median, on the other hand, is a point in the dis- 
tribution that separates the upper 50 per cent of the scores from 
the lower 50 per cent. This point is halfway between the open- 
ing value of the seventh score and its closing value. What 
these end-values are, if strictly interpreted, should be determined 
according to the principles laid down in the paragraph above 
regarding continuous versus discrete series. However, in prac- 
tice, since a median from ungrouped data is a very rough meas- 
ure, we ordinarily treat our numbers as if they were discrete— 
as if the mid-point of score 7 were just 7. The statistical worker 
will seldom wish to deal with mid-scores in contrast with medians; 
hence he will have little use for the formula for mid-score given 
above. 


THE MODE

Meaning.—The mode is the score that occurs with the greatest 
frequency. To use it as a measure of central tendency is to 
employ a standard analogous to the determination of group 
evaluations by means of a plurality vote. In crude statistical 
work we pick out the mode merely by inspection. In fact we 
seldom employ the mode in psychological and educational sta- 
tistics in any more refined form than this rough inspectional 
one. But there are more precise methods for finding the mode 
if one’s problem justifies paying the price of employing them. 
One of these more precise methods is to “smooth” one’s distribution
by averaging the frequencies in adjacent intervals and
continuing thus to average until the irregularities of the distribu- 
tion have been sufficiently ironed out to make one interval 
stand out in frequency above the others, which interval will 
then contain the mode. Of course, if the distribution is bimodal, 
peaks will appear at two places, or more than two if the distribu- 
tion is multimodal, and the smoothing process dare not go so far 
as to cover up legitimate multimodality. The most precise 
method of dealing with the mode involves, however, the use of 
higher mathematics. It requires determining the equation of the
best-fitting curve and calculating the value of the x variable in
the equation for that curve when the y variable is at a maximum.
In Fig. 9 the frequency polygon of a set of scores is indicated by
a broken line and the best-fit curve by a solid line. The equation
of this curve has been found to be, we shall say, $y = 8x - x^2$.
Now, to find the mode, we must find the place along the x axis
where y will have its maximum value. To do this, we differentiate¹
the equation and set the derivative equal to zero. Thus
differentiating, we obtain $8 - 2x = 0$, or $x = 4$. The mode is,
therefore, exactly at the point where x = 4.
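
The smoothing procedure can be sketched in Python. The text does not fix the exact averaging rule, so this sketch assumes a three-interval moving average with the frequencies beyond either end taken as zero; the frequencies are hypothetical, and `smooth_once` is an illustrative name.

```python
def smooth_once(freqs):
    """Replace each frequency by the mean of itself and its two neighbors.

    Intervals beyond either end are assumed to have frequency zero.
    """
    padded = [0] + list(freqs) + [0]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]

# Hypothetical irregular distribution with two competing peaks (6 and 9).
freqs = [2, 6, 5, 9, 4, 1]
smoothed = smooth_once(freqs)
modal_interval = smoothed.index(max(smoothed))

print([round(f, 2) for f in smoothed])
print(modal_interval)   # position of the interval that now stands out
```

One pass may not be enough for a very irregular distribution; the function can be applied again to its own output, stopping before genuine multimodality is ironed away.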

Although the inspectional mode is the crudest of our measures 
of central tendency, the mathematical mode is the most easily 
determined with exact precision when once we have the equation 
for the curve of our distribution. Most empirical distributions 
will, of course, require best-fit curves with much more complex 
equations than the above hypothetical one, but we have been 
concerned merely to give the reader a glimpse as to how a 
mathematical mode is computed. The practical worker in 
our field of statistics will have little or no occasion to compute 


1 See preceding chapter on Calculus, pp. 10-15. 


Fig. 9.—Frequency polygon and curve of best fit.


modes mathematically for empirical distributions. If one did 
have such necessity, he would probably wish to resort to a ready- 
made scheme that Karl Pearson has worked out, which would 
give good enough results for most purposes. By using the sort 
of procedure described above, he found that the mode of his 
“type III curve,” which is representative of a wide variety of 
moderately skew curves, is given by the following formula: 


\[ \text{Mode} = \text{mean} - \frac{\text{mean} - \text{median}}{c} \]


where c, although differing slightly for different distributions, is 
given approximately by the following equation: 
\[ c = 0.3309 - \frac{0.0846(M - \text{Mdn.})^2}{\sigma^2 - 9(M - \text{Mdn.})^2} \]


in which $\sigma$ is the standard deviation of the distribution—a
measure we shall discuss in our next chapter. It is obvious that 
the numerator of the fraction in this last equation will be very 
small compared with the denominator; hence the whole fraction 
will have a value so small that it may be neglected. c will, therefore,
equal approximately 0.33, which is about 1/3. Substituting
1/3 for c in the basic equation above, we have

\[ \text{Mode} = M - 3(M - \text{Mdn.}) \qquad \text{(Mode computed from the mean and the median)} \tag{17} \]

Thus the mode of a distribution may be computed from a 
knowledge of the mean and the median. 
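
Formula (17) is simple enough to state as a one-line function. The sketch below is only an illustration; the mean and median fed to it are hypothetical values, and `pearson_mode` is our own name.

```python
def pearson_mode(mean, median):
    """Approximate mode by formula (17): M - 3(M - Mdn.)."""
    return mean - 3 * (mean - median)

# Hypothetical, moderately skewed distribution.
print(pearson_mode(50.0, 48.0))   # 44.0
```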


OTHER MEASURES OF CENTRAL TENDENCY


Geometric Mean.—Two other measures of central tendency 
are used occasionally, though little in our type of research, 
which we shall notice in passing. One of these is the geometric 
mean. It is used where the successive terms differ by a constant 
ratio instead of by an addend. It is the mean, therefore, of an 
exponential series. Growth in school population in the United 
States from 1900 to 1925 illustrates fairly well such a series; 
the numbers attending increased with a fairly constant accelera- 
tion, so that the trend line depicting the numbers curves con- 
tinually upward, like that of money at compound interest. To 
find the geometric mean, we must take the Nth root of the
product of our measures. Thus 


\[ \text{G.M.} = \sqrt[N]{x_1 x_2 x_3 \cdots x_N} \qquad \text{(Geometric mean)} \tag{18} \]


But ordinarily the only feasible way in which to compute a 
geometric mean is by the aid of logarithms.


\[ \log \text{G.M.} = \frac{1}{N}(\log x_1 + \log x_2 + \log x_3 + \cdots + \log x_N) \]


The geometric mean gives us the value a score would have 
midway through the series if it were actually located on the 
exponential curve determined by the average rate at which the 
scores are increasing. The mid-score is, thus, located on a 
uniformly inflected curve while the arithmetic mean is located 
on a straight line fitted to the measures, if the scores are arranged 
in successive order as to size. When the values are greater than 
unity and positive, the geometric mean is always smaller than 
the arithmetic mean, as a comparison of the straight and curved 
lines mentioned above would indicate; and no meaningful 
geometric mean can be computed if one of the scores is zero. 
To the extent to which the series of scores is irregular instead of 
exhibiting a systematic positive or negative acceleration, to that 
extent a geometric mean would lack pertinency and meaning.¹
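
The logarithmic route to the geometric mean can be sketched in Python. The series is hypothetical, chosen so that successive terms differ by a constant ratio, and the function name is ours; common logarithms are used to mirror hand computation from a table.

```python
import math

def geometric_mean(values):
    """Nth root of the product, computed through logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

# Hypothetical series growing by a constant ratio of 10.
series = [1, 10, 100]
print(geometric_mean(series))      # close to 10, the middle term
print(sum(series) / len(series))   # arithmetic mean, 37.0
```

The check for non-positive values reflects the remark above that no meaningful geometric mean exists when one of the scores is zero.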

The Harmonic Mean.—The other measure of central tendency 
to come in for slight notice is the harmonic mean. It is the 
reciprocal of the mean of the reciprocals of the scores. Its 
formula, by definition, is 


\[ \text{H.M.} = \frac{1}{\dfrac{1}{N}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \dfrac{1}{x_3} + \cdots + \dfrac{1}{x_N}\right)} = \frac{N}{\Sigma \dfrac{1}{x}} \qquad \text{(Harmonic mean)} \tag{19} \]


1 A good elementary discussion of both the geometric mean and the
harmonic mean, with many illustrations, can be found in J. E. Wert,
Educational Statistics, McGraw-Hill Book Company, Inc., 1938, pp. 63-80.


This formula is properly employed where the data are given
in a form that bears a reciprocal relation to a more significant
and meaningful measure. Suppose, for example, data are given
in terms of the number of problems pupils solved in an hour, 
while the number of minutes required per problem is believed to 
be the more straightforward measure. The harmonic mean 
would then be the one to use. The reader will find a good discus- 
sion of the conditions under which to use the harmonic mean 
in an article by Ferger.¹
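
A small Python sketch may make the reciprocal relation concrete. The three rates are hypothetical, and `harmonic_mean` is our own name for a direct transcription of formula (19).

```python
def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals, as in formula (19)."""
    return len(values) / sum(1 / v for v in values)

# Hypothetical rates: problems solved per hour by three pupils.
rates = [2, 3, 6]

# The same pupils need 30, 20, and 10 minutes per problem, a mean of
# 20 minutes, which corresponds to 3 problems per hour.
print(harmonic_mean(rates))
print(sum(rates) / len(rates))   # arithmetic mean of the rates, about 3.67
```

The harmonic mean of the rates agrees with the mean of the minutes per problem, whereas the arithmetic mean of the rates does not.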


VALIDITY OF THE ASSUMPTIONS IN COMPUTING MEANS
AND MEDIANS FROM GROUPED DATA


In computing a median from grouped data one assumes that 
the measures in the interval in which the median lies are equally 
distributed through the interval; hence the point may be located 
by interpolation. In computing a mean, one assumes that the 
measures in each interval are centered about the mid-point of the 
interval. These assumptions are seldom fulfilled in practice. 
If the reader will observe the curve of the normal distribution 
in Fig. 10 below, he will notice that the frequencies in the several 


Fig. 10.—Normal distribution and “trapezoidal” intervals.


intervals make figures that are approximately trapezoids (except 
the middle one, which makes a double trapezoid). In order to 
fulfill the assumptions these figures would need to be rectangles, 
or other symmetrical figures. The mean of a trapezoid is not 
located at its mid-point but somewhere near the longer side. 
Hence, when the point around which the scores of an interval 
center is taken to be the mid-point of the interval, it is taken as 
farther away than it really is, and the moments are, in conse- 
quence, all too large. But in a perfectly symmetrical distribu- 
tion this distortion corrects itself. For the curve behaves 
identically on the two sides, so that an excess of plus moments 


1 Ferger, Wirth F., J. Amer. Statistical Assoc., Vol. 26, pp. 36-40
(March, 1931).
on the one side is exactly neutralized by an excess of minus
moments on the other side. It is also obvious that the median 
is not displaced in a symmetrical distribution on account of 
the “trapezoidal” shape of the interval, because at the critical 
position the curve is practically flat and the positive slope of 
the lower half of the mid-interval is precisely balanced by the 
negative slope of the upper half. But to the extent to which 
the distribution loses its symmetry, as all empirical distributions 
do, to that extent the assumptions about the distribution of 
scores within an interval lose validity. But the error from this 
cause is likely to be small. 

Another condition somewhat invalidating the assumptions is 
lack of uniformity of distribution within the intervals, due either 
to small sampling or to some selective factor. If the number of 
cases in an interval is small, it is unlikely that they will make 
either a regular rectangle or a regular “trapezoid”; their mean 
will be erratic and seldom exactly coincide with the theoretical 
mean of the interval. Hence means calculated from grouped 
data are likely to differ somewhat from those calculated from the 
raw scores and also to vary somewhat as intervals are changed 
in length or in placement, and the more so to the extent to which 
the number of cases is small. In consequence the data should 
never be grouped into intervals involving a wider range than one 
unit, unless there are at least 40 cases. 
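
The dependence of a grouped mean on the length and placement of the intervals can be seen in a short Python sketch. It reuses the ten ungrouped scores from the discussion of the median, a sample far below the 40 cases just recommended, so the shifts are easy to see. The function name and its arguments are illustrative assumptions, and whole-number scores are assumed.

```python
from collections import Counter

def grouped_mean(scores, start, width):
    """Mean computed as if every score sat at the mid-point of its interval.

    Assumes whole-number scores; the interval holding `start` runs from
    `start` through `start + width - 1`.
    """
    counts = Counter((s - start) // width for s in scores)
    total = sum(f * (start + k * width + (width - 1) / 2)
                for k, f in counts.items())
    return total / len(scores)

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
print(sum(scores) / len(scores))    # mean of the raw scores, 6.6
print(grouped_mean(scores, 2, 2))   # intervals 2-3, 4-5, 6-7, 8-9: 6.5
print(grouped_mean(scores, 0, 5))   # intervals 0-4, 5-9: 6.0
```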

Even if the number of cases is large, there is the possibility 
of irregular distributions within the intervals by reason of a 
selective factor favoring certain scores. Thus percentage 
grades are likely to show local modes at 70, 75, 80, 85, etc. The
remedy in such case is to select the limits of the intervals in such 
manner that these modes come at the mid-points. But, of course, 
that principle could not legitimately be followed at the expense 
of keeping the intervals of uniform width. 


Exercises 


Table IV contains data suitable for practice by any students for whom the 
computation of the various measures of central tendency has not yet been 
sufficiently automatized. This table will be drawn upon also for exercises 
in connection with later chapters. It is, therefore, advised that the student 
preserve the distributions set up in the exercises of this chapter (and the 
statistics derived from them) for use on those later occasions, so as to
forestall the necessity of making them again.
TABLE IV.—SCORES MADE IN FIVE DIVISIONS OF THE CARNEGIE FOUNDATION
TEST FOR COLLEGE STUDENTS, TOGETHER WITH INTELLIGENCE TEST SCORES
AND GRADE-POINT AVERAGES AT END OF COLLEGE CAREER
(Girls are labeled G and boys B)


Pupil | Literature | English total | Mathematics | Science | History and social studies | General intelligence | Final grade-point average

B 47 105 46 49 38 56 1.19 
B 51 103 55 58 82 78 1.11
B 43 155 27 80 45 53 1.21 
G 42 154 97 52 20 92 1.83 
G 84 224 46 72 54 111 1.45 
G 80 254 56 100 97 126 2.27 
G 78 223 30 97 55 114 1.75 
G 43 149 34 78 23 71 1.56 
G 53 116 115 61 50 74 1.71 
G 60 148 51 94 36 87 1.72 
B 79 172 19 106 69 83 1.33 
B 46 155 127 78 68 95 2.30 
G 30 126 75 95 56 73 1.81 
G 93 259 84 82 78 118 1.71
B 101 247 65 104 111 114 2.17
G 50 163 21 89 35 74 1.02
G 84 200 69 75 35 109 1.01
G 55 170 58 58 47 97 1.71
G 72 239 45 85 10 98 1.86
G 61 235 61 64 85 134 2.07 
G 100 300 43 90 123 139 2.04 
G 53 153 52 91 55 78 1.64 
B 67 206 112 118 41 113 2.05 
B 112 279 43 93 137 102 1.80 
G 55 178 21 74 32 96 1.61 
G 59 184 46 91 54 103 1.79 
G 83 205 54 118 60 96 1.15
G 63 181 41 71 10 95 1.42 
B 79 253 62 129 55 116 1.33 
G 84 268 51 87 132 102 2.21 
G 42 142 48 83 30 72 1.52 
B 51 187 30 39 42 98 1.21 
G 76 209 125 87 54 103 1.60 
G 89 272 48 107 72 134 1.89 
B 95 244 92 141 239 142 2.03 
B 56 137 47 85 11 75 1.21 
G 90 226 67 114 58 82 1.96 
G 81 221 55 125 72 110 1.45 
G 64 177 57 108 38 102 1.75 
G 84 197 43 94 44 93 1.76 
B 114 256 7 106 166 100 1.37
G 94 273 54 128 105 123 1.74
G 68 213 67 77 0 112 1.29 
B 75 233 33 98 110 90 1.24 
G 91 293 70 98 84 140 2.61 
B 63 168 48 87 70 90 1.04 
G 68 205 66 59 74 99 1.58 
B 52 124 67 122 7 79 1.25 
G 74 153 44 91 40 115 1.95 
G 34 128 16 63 38 87 1.01 
B 68 273 55 111 79 83 2.25 
B 56 213 155 138 44 131 Wee 
G 32 171 55 51 36 78 1.18 
G 64 166 67 81 52 94 1.45 
G 20 72 70 107 40 80 1.12 
G 87 258 152 107 60 110 2.06 
G 41 195 25 54 63 73 1.76 
B 67 227 30 78 68 116 1.44 
G 66 193 79 99 47 90 1.69 
B 66 156 89 111 73 70 1.08 
G 78 187 42 87 61 88 1.31 
G 55 169 32 93 54 73 1.93 
G 116 313 29 87 114 144 2.14 
G 65 252 30 39 69 125 1.49 
G 67 162 37 61 69 100 .92
G 107 224 25 90 87 136 1.40 
B 57 168 98 98 95 79 .81 
B 88 213 56 72 153 107 ygd 
B 45 101 37 74 42 46 1.46 
G 81 209 41 103 74 97 1.93 
G 75 185 32 90 32 94 1.39 
B 78 245 92 171 90 111 2.64 
G 51 155 113 77 41 81 1.49 
B 88 248 61 135 81 90 2.02
G 65 163 56 99 112 68 1.33 
G 87 252 51 101 101 118 2.11 
G 71 195 50 57 ‘| S 113 1.62 
G 64 126 65 72 40 49 1.54 
G 69 228 40 113 76 117 2.20 
G 55 200 95 97 55 112 2.84 
G 74 197 30 45 79 103 1.18 
G 68 189 83 116 60 109 1.47 
B 63 192 19 117 55 106 1.48 
G 88 271 116 94 91 156 2.06 
G 27 98 19 37 36 79 1.06 
B 74 190 54 96 100 76 1.07 
B 51 172 50 71 58 73 1.76 
G 58 157 33 66 74 99 1.62
G 51 127 29 58 46 93 1.03 
G 20 104 25 35 28 77 1.03 
B 38 165 48 138 126 85 2.47 
G 61 212 46 141 68 101 1.79 
B 93 275 52 191 134 148 1.68 
B 55 156 48 76 93 102 1.59 
G 82 258 72 107 115 126 2.20 
G 69 188 36 120 79 86 1.89 
B 84 185 10 68 63 82 1.44 
B 54 153 121 111 81 102 1.23 
G 95 219 38 109 24 W7 1.45
B 49 184 39 118 58 68 1.46 
G 35 122 48 97 137 73 2.00 
B 28 127 15 80 59 110 1.13 
G 14 141 40 51 il 99 1.22 
G 63 173 47 112 59 78 1.88 
B 97 236 79 138 186 131 2.10 
G 77 197 52 88 62 125 1.88 
G 94 273 54 128 105 123 1.74 
G 68 213 67 77 4 112 1.29 
G 34 128 16 63 38 87 1.01 
B 56 213 142 138 44 131 1.77 
G 67 81 52 94 1.45 
G 152 107 60 110 06 
B 30 78 68 116 1.44 


1. From Table IV compute the means of one or more of the columns by
the adding-machine method; i.e., by summing the individual scores and
dividing by the number of scores. 

2. Confirm this mean by assuming a mean and then correcting for excess 
moments, taking the scores severally. 

3. Group the scores of one or more of the columns into frequency dis- 
tributions, and compute the means. Try intervals of various lengths, and 
compare the mean in each case with that of Exercises 1 and 2.

4. Compute medians from the distributions of Exercise 3. Compare
means and medians. Try to account for any differences observed. 

5. Determine the mode for the distributions of Exercise 3. Compare 
means, medians, and modes. How does Pearson’s formula for computing 
the mode from the mean and the median hold out in these trials? 

6. The following table gives the number of pupils attending public high 
schools in the United States by 5-year periods from 1880 to 1925 and the 
ratio of the number at each period to the number at the preceding period. 
What is the most appropriate measure of central tendency to take for these 
data? Compute it. 


TABLE V.—NUMBER OF PUPILS ATTENDING PUBLIC HIGH SCHOOLS IN THE
UNITED STATES FROM 1880 TO 1925


Year | No. of pupils | Ratio of each period to previous period
1880 |   110,227 |
1885 |   160,137 | 1.453
1890 |   202,963 | 1.267
1895 |   350,099 | 1.725
1900 |   519,251 | 1.483
1905 |   679,702 | 1.309
1910 |   915,061 | 1.346
1915 | 1,328,984 | 1.452
1920 | 1,857,155 | 1.397
1925 | 3,065,009 | 1.650


References for Further Study 


Ferger, Wirth F.: “On the Use of the Harmonic Mean,” J. Amer. Statistical
Assoc., Vol. 26, pp. 36-40 (March, 1931).

Pearson, Karl: “Skew Variation in Homogeneous Material,” Trans. Roy.
Soc. (London), Series A, Vol. 186, pp. 343f.; and Vol. 197, pp. 443-459.
(The formula for mode in terms of mean and median.)


CHAPTER III 
MEASUREMENT OF VARIABILITY 


Our preceding chapter dealt with formulas for finding some 
representative number with which to describe the general size 
of the scores of a distribution. In this chapter we shall take 
up formulas for expressing the degree of scatter in the scores—the
extent to which they are grouped closely about the central 
tendency or spread widely from it. Just as was the case in 
dealing with central tendencies, we may have two types of 
measures for variability—measures in terms of moments and 
measures in terms of the location of points. The former include 
average deviation and standard deviation; the latter include 
such measures as range, percentiles, quartile range, and many 
other interpoint ranges. We shall treat first the measures of 
variability in terms of moments. 


AVERAGE DEVIATION 


The method of measuring variability likely to be most familiar 
to a layman, or to seem most reasonable to him when mentioned, 
is average deviation. This involves merely subtracting each 
score from the mean and finding the average (mean) of the 
deviations thus obtained, algebraic sign being disregarded. 
The formula is, if x represents the deviation of a score from the 
mean and the enclosing lines indicate that these deviations 
are to be taken without regard to algebraic sign, 


\[ \text{A.D.} = \frac{\Sigma |x|}{N} \tag{20} \]
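
As a minimal sketch of formula (20), the following Python function takes each deviation from the actual mean without regard to sign and averages the results. The scores are the ten ungrouped scores used in the preceding chapter's discussion of the median; the function name is ours.

```python
def average_deviation(scores):
    """Mean of the absolute deviations from the mean, formula (20)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(s - mean) for s in scores) / n

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
print(round(average_deviation(scores), 2))   # 2.08
```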


If the data are grouped into a frequency distribution, the $|x|$'s
will merely be multiplied by their respective frequencies before
being added. If the deviations are taken in intervals rather than
in scores (the former is the proper way), the A.D. will be in
intervals but can easily be changed to scores by multiplying by
the width of the interval, i. The whole formula will, then, be


\[ \text{A.D.} = \frac{\Sigma f|x|}{N} \cdot i \tag{20a} \]


This simple formula answers very well if the number of items 
is small or if the mean happens to be a convenient whole number. 
But if the mean contains decimals, the number of digits involved 
in each subtraction process, and in the summation processes, is 
likely to be inconveniently large. It is then most convenient 
to take the deviations from some assumed mean that is a whole 
number and to make a correction to atone for the error that would 
otherwise be introduced. Let c be the distance from the assumed 
mean to the true mean. Then, if x is the deviation from the true
mean and x' the deviation from the assumed mean in the case of
any score, the $|x|$ for each score above the true mean will be $|x'| - c$,
and that for each score below the true mean will be $|x'| + c$. Let
us use the subscript l to refer to scores below the true mean and
g to refer to those above the true mean. We shall then have the
following:

\[ \Sigma |x_g| = \Sigma |x'_g| - f_g c, \quad \text{and} \quad \Sigma |x_l| = \Sigma |x'_l| + f_l c \]
Adding,
\[ \Sigma |x_g| + \Sigma |x_l| = \Sigma |x'_g| + \Sigma |x'_l| + (f_l - f_g)c \]
Dividing by N,


\[ \text{A.D.} = \frac{\Sigma |x|}{N} = \frac{\Sigma |x'| + (f_l - f_g)c}{N} \tag{21} \]


As in finding a mean from a guessed average, $c = \Sigma x'/N$,
i.e., the sum of the deviations about the assumed mean divided 
by the number of cases. Normal account must be taken of the 
algebraic sign of the c. 

If the data are grouped in a frequency distribution, deviations 
should be taken in terms of intervals rather than in scores, each 
x should be multiplied by the proper frequency when adding, 
and the whole fraction must be multiplied by the length of the 
interval to get back to scores. Thus the formula becomes 


\[ \text{A.D.} = \frac{\Sigma f|x'| + (f_l - f_g)c}{N} \cdot i \qquad \text{(Average deviation when deviations are taken from an assumed mean)} \tag{22} \]

Each score that is greater than the true mean by no matter
how little counts among the $f_g$'s, and each that is less counts
among the $f_l$'s. In a frequency distribution all the scores of a given
interval count among the $f_g$'s or the $f_l$'s, according to whether
the mid-point of the interval is above or below the true mean by
no matter how little. Why this is true an examination of our
formula will disclose.

It is important to note that the above formulas can be used 
only when the assumed mean differs from the true mean by less 
than one unit (or less than one interval). This is because other- 
wise there would be between the assumed mean and the true 
mean some deviations that do not use up the whole of the c. 
If the guessed mean with which one has started turns out to differ 
from the true mean by more than this, one must start again with 
a mean that fulfills this requirement, unless he wishes to make a 
somewhat complicated adjustment for the omitted c units. 
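
Formula (21) and the restriction just stated can be sketched in Python. The sketch assumes whole-number scores and a whole-number assumed mean, so that no score can fall between the assumed mean and the true mean when c is less than one unit; the function name and the choice of 7 as the assumed mean are ours.

```python
def average_deviation_assumed(scores, assumed_mean):
    """A.D. by formula (21), deviations taken from an assumed mean.

    Assumes whole-number scores and a whole-number assumed mean.
    """
    n = len(scores)
    c = sum(s - assumed_mean for s in scores) / n   # correction, sign kept
    if abs(c) >= 1:
        raise ValueError("assumed mean must lie within one unit of the true mean")
    true_mean = assumed_mean + c
    sum_abs = sum(abs(s - assumed_mean) for s in scores)
    f_l = sum(1 for s in scores if s < true_mean)   # cases below the true mean
    f_g = sum(1 for s in scores if s > true_mean)   # cases above the true mean
    return (sum_abs + (f_l - f_g) * c) / n

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
shortcut = average_deviation_assumed(scores, 7)
direct = sum(abs(s - 6.6) for s in scores) / len(scores)
print(round(shortcut, 2), round(direct, 2))   # 2.08 2.08
```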

Assumed Mean at Zero.—One can escape the limitation stated 
in the preceding paragraph by assuming the mean at zero, in 
which case all deviations become merely the scores themselves.
The resultant formula is then of general application, besides
having some other advantages, particularly if one is working 
with a calculating machine. Let X represent any score and M 
represent the mean. Then each deviation above the mean will 
equal X — M, and each deviation below the mean will equal 
M — X. Our summed deviations will then be 


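\[ \Sigma |x_g| = \Sigma X_g - f_g M, \quad \text{and} \quad \Sigma |x_l| = f_l M - \Sigma X_l \]
Adding,
\[ \Sigma |x| = (\Sigma X_g - \Sigma X_l) + (f_l - f_g)M \]


Dividing through by N and then making a rearrangement,


\[ \text{A.D.} = \frac{(\Sigma X_g - \Sigma X_l) + (f_l - f_g)M}{N} = \frac{(\Sigma X - 2\Sigma X_l) + (f_l - f_g)M}{N} \qquad \text{(Average deviation in terms of raw scores)} \tag{23} \]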


By formula (23) the computation of A.D. with an adding 
machine is very easy—perhaps the easiest of all the variability 
measures. Without even taking the trouble to arrange the 
scores in order of magnitude, one merely sums them on the 
machine to get $\Sigma X$ and N and thence $\Sigma X/N = M$. Then he
goes through the set of scores a second time, running in all the
scores which are less than M to get $\Sigma X_l$ and $f_l$. Scores which
exactly equal the mean, if any, may either be counted among
the $X_l$ or among the $X_g$.
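
The two passes through the scores can be mimicked in Python. This is a sketch of formula (23) only; the function name is ours, and scores exactly equal to the mean are counted among the upper group, one of the two choices the text allows.

```python
def average_deviation_raw(scores):
    """A.D. by formula (23), working from the raw scores alone."""
    n = len(scores)
    total = sum(scores)                      # first pass
    mean = total / n
    below = [s for s in scores if s < mean]  # second pass
    f_l = len(below)
    f_g = n - f_l                            # ties with the mean counted here
    return ((total - 2 * sum(below)) + (f_l - f_g) * mean) / n

scores = [9, 2, 8, 5, 9, 4, 7, 9, 5, 8]      # order does not matter
print(round(average_deviation_raw(scores), 2))   # 2.08
```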

If we are dealing with a frequency distribution rather than 
with scores, we may work with the actual mid-point values, in 
which case the formula holds just as above (frequencies in the 
intervals being, of course, taken account of). Or we may work 
in terms of intervals instead of score values, then multiply by the 
length of the interval at the end of the process so as to get back 
to score values. For this purpose we may number our intervals 
in any way we please. But in this latter case we must replace 
the M of the formula by C, which as usual equals $\Sigma fX/N$. Our
formula will then be written 


\[ \text{A.D.} = \frac{(\Sigma fX_g - \Sigma fX_l) + (f_l - f_g)C}{N} \cdot i = \frac{(\Sigma fX - 2\Sigma fX_l) + (f_l - f_g)C}{N} \cdot i \qquad \text{(Average deviation in frequency table)} \tag{23a} \]


But since the M in the case of single scores would also be found
by the formula $\Sigma X/N$, just as is our C, this last formula in either
of its shapes is of general application. For if the data are individual
scores, the f in each summation is 1, and the i is 1, so
that they may be ignored as factors in actual operative processes.
This is a particularly useful formula when working with a
calculating machine. One needs only sum the whole series,
divide the sum by N, then go back and sum again the items
that have values less than the $\Sigma fX/N$ in order to get the $\Sigma fX_l$,
the $f_l$, and the $f_g$ demanded in the formula.

Average Deviation from a Median.—An average deviation can 
be taken from a median, or from any of the other central tend- 
encies, by precisely the same techniques as from the mean. 
Usually, however, point measures of variability rather than 
moment measures will be employed in connection with a median.
But it is worth noting that the average deviation is a minimum 
when taken from the median rather than from the mean or 
from any other point. 
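
The minimum property can be checked numerically. The Python sketch below, with names of our own choosing, takes the average deviation of the same ten scores about the mean and about the median, the latter found by the standard-library `statistics.median`.

```python
from statistics import mean, median

def average_deviation_about(scores, point):
    """Mean absolute deviation of the scores from an arbitrary point."""
    return sum(abs(s - point) for s in scores) / len(scores)

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
about_mean = average_deviation_about(scores, mean(scores))
about_median = average_deviation_about(scores, median(scores))
print(round(about_mean, 2), round(about_median, 2))
```

For these scores the deviation about the median (7.5) comes out smaller than that about the mean (6.6).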

We shall illustrate four procedures in finding the average
deviation from a frequency distribution. Three of them are
variations based on formula (23a): procedure 1 is in terms of the actual
values of the mid-points of the intervals, as shown in the column
headed X; procedure 2 is in terms of intervals with the numbering
beginning at 0; while procedure 3 is in terms of intervals with the
numbering beginning at 1. Procedure 4 is based on formula (22). It
will be seen that all four procedures give precisely the same result.


TABLE VI.—ILLUSTRATION OF THE COMPUTATION OF AVERAGE DEVIATION


Score     Mid-value X     f      fX     x' from 0     fx'     x' from 1     fx'     |x'| from near mean     f|x'|
20-24         22          3      66         4          12         5          15              2                 6
15-19         17          8     136         3          24         4          32              1                 8
10-14         12         20     240         2          40         3          60              0                 0
 5- 9          7         12      84         1          12         2          24              1                12
 0- 4          2         10      20         0           0         1          10              2                20
Totals        ..         53     546        ..          88        ..         141             ..                46


Mean = 10.3. Mean in intervals = 1.66 (numbering from 0) or 2.66 (numbering from 1).


\[ 1.\quad \text{A.D.} = \frac{546 - 2 \cdot 104 + (22 - 31)(10.3)}{53} = \frac{245.3}{53} = 4.63 \]

\[ 2.\quad \text{A.D.} = \frac{88 - 2 \cdot 12 + (22 - 31)(1.66)}{53} \cdot 5 = \frac{49.06}{53} \cdot 5 = 4.63 \]

\[ 3.\quad \text{A.D.} = \frac{141 - 2 \cdot 34 + (22 - 31)(2.66)}{53} \cdot 5 = \frac{49.06}{53} \cdot 5 = 4.63 \]

\[ 4.\quad \text{A.D.} = \frac{46 + (22 - 31)(-.34)}{53} \cdot 5 = \frac{49.06}{53} \cdot 5 = 4.63 \]
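
The four procedures can be re-run in Python as a check on the arithmetic. The frequencies and mid-point values are those of Table VI; the function and variable names are ours, and the sketch follows formulas (23a) and (22) as reconstructed above.

```python
freqs = [3, 8, 20, 12, 10]          # f, from the top interval downward
mid_values = [22, 17, 12, 7, 2]     # X
width = 5                           # i
n = sum(freqs)

def ad_23a(x_values, interval):
    """Formula (23a); x_values may be mid-points or interval numbers."""
    total = sum(f * x for f, x in zip(freqs, x_values))
    c = total / n
    below = [(f, x) for f, x in zip(freqs, x_values) if x < c]
    f_l = sum(f for f, _ in below)
    f_g = n - f_l
    sum_below = sum(f * x for f, x in below)
    return ((total - 2 * sum_below) + (f_l - f_g) * c) / n * interval

p1 = ad_23a(mid_values, 1)             # actual mid-point values
p2 = ad_23a([4, 3, 2, 1, 0], width)    # intervals numbered from 0
p3 = ad_23a([5, 4, 3, 2, 1], width)    # intervals numbered from 1

# Procedure 4: formula (22), assumed mean at the mid-point of 10-14.
devs = [2, 1, 0, -1, -2]
c = sum(f * d for f, d in zip(freqs, devs)) / n
sum_abs = sum(f * abs(d) for f, d in zip(freqs, devs))
f_l = sum(f for f, d in zip(freqs, devs) if d < c)
f_g = n - f_l
p4 = (sum_abs + (f_l - f_g) * c) / n * width

print([round(p, 2) for p in (p1, p2, p3, p4)])   # each 4.63
```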


STANDARD DEVIATION 


Definition.—The standard deviation differs from the average 
deviation only in the fact that the deviations are squared before 
they are summed; then the square root of the mean of these is 
taken. The symbol conventionally used for standard deviation 
is $\sigma$, the Greek letter sigma corresponding to our lower case s
—a practice upon which we shall comment in a footnote shortly.
By definition, the formula for the standard deviation of an array 
of scores from the actual mean is 

\[ \sigma = \sqrt{\frac{\Sigma x^2}{N}} \]


Sigma from an Assumed Mean.—It is seldom convenient to
take the deviations from the actual mean, since such deviations 
usually involve decimals which are cumbersome to handle when 
squared. It is much more convenient to work from some 
assumed mean that will involve only whole numbers. Let c be 
the amount by which the assumed mean differs from the actual 
mean. Then, if x represents the deviation from the correct
mean and x' the deviation from the assumed mean, for one
score,

\[ x = x' - c, \quad \text{or} \quad x' = x + c \]


Squaring this deviation for one item,

\[ x'^2 = x^2 + 2cx + c^2 \]
When we sum for the whole set of scores, the $c^2$ will enter as
many times as there are items, thus becoming $Nc^2$; and the
various $x^2$'s, since they are different, will need to be represented
by $\Sigma x^2$. Summing we get


\[ \Sigma x'^2 = \Sigma x^2 + 2c\Sigma x + Nc^2 \]


But $\Sigma x$ (in the middle term) equals 0, since it is the sum of the
deviations about the actual mean and such sum always equals
zero. The whole middle term will, therefore, become zero and
drop out. We shall then have


\[ \Sigma x'^2 = \Sigma x^2 + Nc^2 \]
Transposing,
\[ \Sigma x^2 = \Sigma x'^2 - Nc^2 \]


Dividing through by N and substituting $\sigma^2$ for $\Sigma x^2/N$,


\[ \sigma = \sqrt{\frac{\Sigma x'^2}{N} - c^2} \qquad \text{(Standard deviation from an assumed mean)} \tag{24} \]


It will be observed that this formula holds absolutely, not merely 
approximately. One need have no hesitation in applying it to 
a distribution of any shape or in taking his assumed mean any 
place he pleases. The result will be precisely the same whether 
working from the actual mean or from any assumed mean.
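
That formula (24) holds for any assumed mean can be checked with a short Python sketch. The scores and the three assumed means are chosen by us for illustration, and the function name is ours; the standard-library `statistics.pstdev` serves as an independent check.

```python
import math
from statistics import pstdev

def sigma_from_assumed_mean(scores, assumed_mean):
    """Standard deviation by formula (24)."""
    n = len(scores)
    c = sum(s - assumed_mean for s in scores) / n
    mean_square = sum((s - assumed_mean) ** 2 for s in scores) / n
    return math.sqrt(mean_square - c ** 2)

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
results = [sigma_from_assumed_mean(scores, a) for a in (7, 0, 20)]
print([round(r, 4) for r in results])   # the same value three times
print(round(pstdev(scores), 4))
```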

Zero as the Assumed Mean.—If the assumed mean is taken
somewhere near the true mean, the numbers will be smaller and 
the arithmetical work consequently less laborious if done by 
hand. But there will be both positive and negative signs with 
which to worry, which are somewhat annoying in any case and 
particularly so if one is working with a calculating machine. It 
is often most convenient, especially when working with a machine, 
to place the assumed mean at zero. Then all deviations will be 
positive. Moreover, the deviations will be precisely the same 
as the scores, since each score differs from zero by its whole self. 
And c will be the mean of the scores in an ungrouped series, or 
in any case $\Sigma fX/N$. The formula then becomes, where X
represents any score,


\[ \sigma = \sqrt{\frac{\Sigma X^2}{N} - \frac{(\Sigma X)^2}{N^2}} = \frac{1}{N}\sqrt{N\Sigma X^2 - (\Sigma X)^2} \qquad \text{(Standard deviation in terms of raw scores)} \tag{24a} \]
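
Formula (24a) needs only the sum of the scores and the sum of their squares, which is what makes it convenient on a machine. A minimal Python sketch, with a function name of our own:

```python
import math

def sigma_raw(scores):
    """Standard deviation by formula (24a), from raw scores only."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(s * s for s in scores)
    return math.sqrt(n * sum_x2 - sum_x ** 2) / n

scores = [2, 4, 5, 5, 7, 8, 8, 9, 9, 9]
# Here the sum is 66 and the sum of squares is 490, so the radicand is 544.
print(round(sigma_raw(scores), 4))   # 2.3324
```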


The Population Variability—The measure of variability dis- 
cussed above is the standard deviation of the sample of scores one 
has in hand. If the size of the sample could be increased by the 
addition of further typical scores, the standard deviation would 
be slightly increased. As the sample approached the whole 
population in size (the “population” being a theoretically 
infinite number of individuals of the kind sampled in the distribu- 
tion we have in hand), the standard deviation would approach 
the limit¹ as follows:


¹ The conventional symbol for an estimate of the population variance
from a sample is $s^2$, as we have used it. However, best statistical usage
reserves the Greek letters for “true” values, so that $\sigma^2$ should be used to
designate the theoretical population variance rather than the computed
variance of the sample. But American practice is so far committed to the
use of $\sigma$ as we have employed it in the early paragraphs of this chapter that
we feel it would not be feasible to change at this stage. We shall, therefore,
continue to use $\sigma^2$ for the computed sample variance and, following the
Pearson School, employ the tilde over $\sigma$, $\tilde{\sigma}^2$, to indicate a theoretical
population variance. We follow R. A. Fisher in using $s^2$ for an estimate of the
population variance. Some other authors use $s'^2$ for the population value
and $s^2$ for the sample value.

It is something of a nuisance to indicate square root each time we
wish to talk about variability. So the term variance is used for
the standard deviation squared. In this terminology the
estimate of the population variance is the sample variance
multiplied by $N/(N - 1)$. Since $\sigma^2 = \Sigma x^2/N$ and $s^2 = \sigma^2[N/(N - 1)]$,
evidently


$$s^2 = \frac{\Sigma x^2}{N - 1} \qquad \text{(Population variance estimated from a sample)} \tag{25}$$


The proof sometimes given for the above formula for the esti- 
mate of the population variance is very complicated, involving 
the geometry of hyperspace. But a valid proof is really very 
simple. 

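Let $x$ be a deviation from a sample mean and $x_1$ a corresponding
deviation from the mean of the whole population (which is the
mean of all sample means). Then $c$, as we have used it above,
is, for each sample, the mean of that sample, here written $M$.
Therefore

$$\Sigma x^2 = \Sigma x_1^2 - NM^2$$

$$\Sigma x_1^2 = \Sigma x^2 + NM^2$$

Sum for all samples, call them $S$ in number, and divide by $SN$
where $N$ is the number in each sample. Also consider the $\sigma$'s
of the samples sufficiently alike to be treated as an average (the
straight bar over a symbol denotes it an average and the symbols
above and below $\Sigma$ denote the limits between which sums are
taken).

$$\frac{\sum_1^S \sum_1^N x_1^2}{SN} = \frac{\sum_1^S \sum_1^N x^2}{SN} + \frac{N \sum_1^S M^2}{SN}$$

$$\tilde{\sigma}^2 = \bar{\sigma}^2 + \sigma_M^2$$

On page 132 formula (64), we show that $\sigma_M^2 = \tilde{\sigma}^2/N$. Making this
substitution and performing some simple algebraic operations,

$$\tilde{\sigma}^2 = \bar{\sigma}^2 + \frac{\tilde{\sigma}^2}{N}, \qquad N\tilde{\sigma}^2 = N\bar{\sigma}^2 + \tilde{\sigma}^2$$

$$N\tilde{\sigma}^2 - \tilde{\sigma}^2 = N\bar{\sigma}^2, \qquad \tilde{\sigma}^2 = \frac{N}{N - 1}\,\bar{\sigma}^2$$

So if we were estimating the population variance from our
scores, we would merely divide by $(N - 1)$ instead of by $N$.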
If the scores in terms of which we were working were deviations
from any other point than the actual sample mean, the proper
adjustments could easily be made; for deviations from some other 
point than the actual sample mean we would have 

$$s^2 = \frac{N\Sigma x'^2 - (\Sigma x')^2}{N(N - 1)} \qquad \text{(Estimate of the population variance when deviations in the sample are taken from an assumed mean)} \tag{26}$$
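
Formulas (25) and (26) can be compared with one another and with the sample variance. The sketch below is illustrative only; the function names and the assumed mean of 10 are our own choices, and Python's `statistics.variance` (which also divides by N − 1) is used as an independent check.

```python
import statistics

def sample_variance(scores):
    """Variance of the sample in hand (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

def population_variance_estimate(scores):
    """Formula (25): sum of squared deviations divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

def estimate_from_assumed_mean(scores, assumed_mean):
    """Formula (26): deviations x' taken from an assumed mean."""
    n = len(scores)
    deviations = [x - assumed_mean for x in scores]
    sum_sq = sum(d * d for d in deviations)
    return (n * sum_sq - sum(deviations) ** 2) / (n * (n - 1))

scores = [0, 3, 8, 12, 15, 20, 25, 29]
print(sample_variance(scores))                 # divides by N
print(population_variance_estimate(scores))    # divides by N - 1
print(estimate_from_assumed_mean(scores, 10))  # hypothetical assumed mean
print(statistics.variance(scores))             # library check
```

The last three values should agree, and each should equal the first multiplied by N/(N − 1).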


It is chiefly in connection with formulas for standard errors, a 
phase of theoretical statistics which we consider later, that we 
need estimates of the population variance rather than the sample 
variance. In trying to give a sense of the scatter of an empirical 
distribution for descriptive purposes, it is the variance of the 
sample rather than an estimate of the population variance that is 
customarily employed. But if, in order to make the statistics 
more strictly comparable when the samples are very small and of 
unequal sizes, one wishes to express the variability in terms of the 
population estimate rather than in terms of the sample, one 
should be careful to call his statistic an estimate of the population 
variability and to use the letter s to designate it. 

Sigma from Grouped Data—When scores are grouped into a 
frequency distribution, they are all considered to be centered 
about the mid-point of the interval in which they occur. We 
consider all the scores in the interval 10-19, for example, to be 
represented by a value of 14.5; all from 20 to 29 by 24.5; etc.
Hence, instead of adding these values one by one (after squaring 
them), we resort to multiplication which is merely an abbreviated 
form of addition. Our formula then becomes the following, or 
any of its algebraic equivalents as indicated above: 


$$\sigma = \sqrt{\frac{\Sigma fX^2}{N} - \left(\frac{\Sigma fX}{N}\right)^2} \tag{27}$$


Nor do we bother to take these deviation values in score terms, 
since that would involve unnecessarily large numbers; we take the 
deviations, instead, in intervals, starting from the lowest interval, 
which we call zero. We can get back to score form by merely 
multiplying by the width of the interval. We do not lose a single 
iota of accuracy by this short-cut method. If we take our devia- 
tion in intervals, our whole formula then becomes 


$$\sigma = i\sqrt{\frac{\Sigma fX^2}{N} - \left(\frac{\Sigma fX}{N}\right)^2} \tag{28}$$

For many purposes, especially if one is working with a calculating
machine, the most convenient algebraic form in which to put this
formula is the following:

$$\sigma = \frac{i}{N}\sqrt{N\Sigma fX^2 - (\Sigma fX)^2} \tag{28a}$$

This formula is really general in application. The $X$ may be a
deviation from any mean as well as from zero. The $f$ and the $i$
are always implied in a formula whether expressed or not; they are
merely “symbols of operation.” However, if one is working with
single scores instead of frequency distributions, the $f$ and the $i$
are each 1.

Fig. 11.—Mean of an interval of a normal distribution versus the mid-point.

Correcting for Grouping.—When one groups data into a frequency
distribution for the calculation of a standard deviation he
loses something in accuracy. For he treats his items as if they
were all at the mid-point whereas they are really scattered
through the interval. When the deviation values are squared, 
those that lie beyond the mid-point should add relatively more 
to the moments than those that lie on the hither side. The 
matter is further complicated by the fact that the intervals 
normally make figures somewhat trapezoidal in shape. An 
examination of Fig. 11 will show that $m_c$, the mean of the scores
in the interval around which the moments center, does not
coincide with $i$, the mid-point of the class. When any kind of
moments, whether squared or not, are taken from the mean of the
distribution to $i$ instead of to $m_c$, the moments are too great.
And the same would be true of the aggregate of the moments and 
of their mean, whether squared before adding or not. Hence 
both the standard deviation and the average deviation taken 
from grouped data with an interval range of more than one unit
are somewhat greater than the true ones. The same would be
found to be true of all the interpoint variability measures to be
discussed in our next section; all are somewhat too large when
taken from grouped data. Sheppard has shown that, in a normal
distribution, the correction to be made to the crude $\sigma^2$ is $-\frac{1}{12}$
when both the $\sigma^2$ and the $\frac{1}{12}$ are in $i^2$ units. Since all terms under
the radical in the standard deviation formula when working with
intervals as units (except, of course, the $N$) are in $i^2$ units, we may
write our corrected formula


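$$\sigma_c = i\sqrt{\frac{\Sigma fX^2}{N} - \left(\frac{\Sigma fX}{N}\right)^2 - \frac{1}{12}} = \frac{i}{N}\sqrt{N\Sigma fX^2 - (\Sigma fX)^2 - \frac{N^2}{12}} \qquad \text{(Standard deviation with Sheppard's correction)} \tag{29}$$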


In a technical note closing this chapter we give the proof of 
Sheppard’s correction. Although the reader who has mastered 
the calculus of our last chapter will be able to follow the develop- 
ment if he watches his step, it is unfortunately about the most 
difficult of the proofs we undertake to give in this book. 

Average deviation and the point measures of variability might 
also be corrected for broad categories. However, the corrections 
would be small, and, since we seldom employ these measures of 
variability in refined statistical work, the correction is scarcely 
worthwhile. Indeed this whole topic is introduced here not so
much to urge making the correction as to warn against the
calculation of variability measures from few intervals without
recognizing that the results may involve appreciable error. A
little experimenting will show that the correction of 0.08333 in the
standard deviation formula will make very little difference if the
number of intervals is 15 or 18 but will make considerable
difference if the number is small. But note that the particular
correction, $\frac{1}{12}$, applies only to standard deviation.

Table VII illustrates the computation of the standard deviation
for the same data employed in the computation of average
deviation in Table VI. In the illustration we take deviations 
from an assumed mean (at mid-point of interval 10-14) near the 
true mean, but all the processes would be completely similar if 
we were working from an assumed mean at the mid-point of the 
lowest interval or at any other place.

TABLE VII.—ILLUSTRATION OF THE COMPUTATION OF STANDARD DEVIATION

Score         Deviation    Frequency      fx'      fx'^2
                 x'            f
20-24             2            3            6        12
15-19             1            8            8         8
10-14             0           20            0         0
 5- 9            −1           12          −12        12
 0- 4            −2           10          −20        40
Totals (Σ)                    53          −18        72

$$\sigma = \left(\sqrt{\frac{N\Sigma fx'^2 - (\Sigma fx')^2}{N^2}}\right) i = \left(\sqrt{\frac{3{,}816 - 324}{2{,}809}}\right)(5) = (1.12)(5) = 5.6$$

$$\sigma_c = \left(\sqrt{\frac{\Sigma fx'^2}{N} - \left(\frac{\Sigma fx'}{N}\right)^2 - \frac{1}{12}}\right) i = \left(\sqrt{1.2432 - 0.0833}\right)(5) = 5.4$$


We have used $\sigma_c$ for the standard deviation with Sheppard’s
correction and $\sigma$ for the standard deviation from coarse grouping.
It will be observed that applying the correction here
makes an appreciable difference because the number of intervals 
is rather small. The distribution departs considerably from 
normality, so that the assumptions involved in Sheppard’s 
correction formula are not strictly fulfilled. But the error from 
that cause is small. 

It is recommended that, for practice, the student compute the $\sigma$
from various other assumed means.
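
The exercise just recommended can also be carried out mechanically. The following Python sketch (our own illustration, not the authors' procedure as printed) applies formula (28) and Sheppard's correction to the frequencies of Table VII, taking the assumed mean in turn at the mid-point of each interval; the function name and its arguments are hypothetical.

```python
import math

# Frequencies of Table VII, lowest interval (0-4) first; interval width i = 5.
frequencies = [10, 12, 20, 8, 3]
width = 5

def grouped_sigma(frequencies, width, assumed_index, sheppard=False):
    """Standard deviation from grouped data, deviations counted in intervals.

    assumed_index is the position (0 = lowest interval) of the interval
    whose mid-point serves as the assumed mean.
    """
    n = sum(frequencies)
    deviations = [k - assumed_index for k in range(len(frequencies))]
    sum_fx = sum(f * d for f, d in zip(frequencies, deviations))
    sum_fx2 = sum(f * d * d for f, d in zip(frequencies, deviations))
    variance = sum_fx2 / n - (sum_fx / n) ** 2   # in squared interval units
    if sheppard:
        variance -= 1 / 12                        # Sheppard's correction
    return width * math.sqrt(variance)

for index in range(len(frequencies)):
    crude = grouped_sigma(frequencies, width, index)
    corrected = grouped_sigma(frequencies, width, index, sheppard=True)
    print(index, round(crude, 1), round(corrected, 1))
```

Every line of output should show the same pair of values, 5.6 and 5.4, which is the point of the exercise: the choice of assumed mean does not affect the result.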


POINT MEASURES OF VARIABILITY 


So far we have discussed two measures of variability that are 
put in terms of moments. Another method of measuring scatter 
is in terms of the distance between points in the distribution. 
This takes many forms. The process involves, however, no 
particular difficulties, so that we may pass over its discussion 
very hastily. The technique of locating any of the required 
points within the distribution is precisely the same as the tech- 
nique of locating a median, discussed in our preceding chapter. 
The principal interpoint measures of variability are the following: 

1. The Range.—This is the distance from the lowest score to 
the highest. It may be stated in terms of the difference between
the lowest and the highest score. Or one may say, and with a
richer meaning than the former, that the scores ranged from — to —.

2. The Median Deviation.—This involves subtracting each 
score from the mean or from the median, arranging the deviations 
in order of size regardless of algebraic sign, and finding the mid- 
point of the series. If the median deviation is to be taken from 
the median rather than from the mean, a less laborious method is 
available. But this measure of variability has little to recom- 
mend it, and it is seldom used. 

3. The Quartile Deviation, Called Q.—This is the most widely 
used of the point measures of variability. It is the distance from 
a point one-quarter through the distribution ($Q_1$) to the point
halfway through (Mdn. or $Q_2$). Ordinarily it is taken as half
the distance between the first and the third quarter points and is
called the semi-interquartile range. This method of computation
has the effect of taking the average of the quartile ranges both
above and below the median. The formula is

$$Q = \frac{Q_3 - Q_1}{2} \qquad \text{(Semi-interquartile range)} \tag{30}$$


4. The Inter-quartile Range, or the Range of the Middle 50 per 
Cent.—This is $Q_3 - Q_1$. It may be stated as the difference
between the two quarter points, but it is much more informative
to give the scores at the limits; i.e., the middle 50 per cent ranged
from — to —.

5. P.E. is the same as Q except that it has become customary 
to restrict its application to the quartile range of a theoretical 
(consequently perfectly normal) distribution. The term should 
not be employed in describing the variability of empirical 
distributions. 

6. Range from the 10th to the 90th Percentiles, Called D—This is 
a highly reliable measure of variability that deserves more use 
than it has so far had. 

7. The ten decile points, located at the end of the distribution 
and at nine places within it so as to divide the distribution into 
ten equal parts. While this is not a measure of variability in the 
direct sense in which the others are (since no distances between
points are indicated) the location of the decile points does give an
excellent account of the scatter of the distribution. Quintiles 
serve the same general purpose though not so completely. 

8. Percentiles.—These divide the distribution into a hundred
parts just as the decile points divide it into tenths. They might
be located one by one by the same techniques as those employed
in finding quarter points or decile points, but it is ordinarily
sufficient to get them by interpolation from a smaller number of
locations, either arithmetically or graphically. One method is to
locate the decile points in score values, each determined in a
manner analogous to that illustrated for the median, then
interpolate roughly for the intermediate percentile points on the
assumption of rectangular distributions within each of the nine
interdecile ranges. Another method, and a better one, is to
locate the percentile value of the top of each of the successive
intervals by ascertaining how many hundredths of the whole
distance through the distribution are covered by the frequency
to the top of the interval in question, then to interpolate within
each interval to allot roughly the intervening percentile values to
the scores within the interval. For graphical determination the
best way is to locate, on squared paper or on specially ruled paper
(like the Otis Universal Percentile Graph), the score values at
the tops of the successive intervals on the $y$ axis and draw
through these points by hand a smooth curve. To determine the
score value of any desired percentile, $P_a$, find or erect an ordinate
$P_a/100$ of the distance along the $x$ axis from the location of the
beginning of the frequencies to that where they end, and read the
required value from the point along the $y$ axis at which this
ordinate cuts the curve. This is illustrated by the graph on page
76, utilizing the data of Table II, page 45.

Fig. 12.—Cumulative percentile curve.

As nearly as we can estimate from our setup, the 25th per- 
centile has a score value of about 139; the 50th, about 141; and 
the 75th, about 159. We could make a much more accurate 
estimate if the chart were large and there were accurately ruled 
guide lines.¹
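
The arithmetical method of interpolating within an interval can be sketched in a few lines of Python. This is an illustration only: it uses the frequencies of Table VII rather than those of Table II, it assumes that each interval runs from its stated lower score up to the lower score of the next interval (so that 0-4 covers 0 to 5), and the function name is our own.

```python
def percentile_point(lower_limits, width, frequencies, p):
    """Score value below which p per cent of the cases fall.

    Cases are assumed to be spread evenly (rectangularly) through
    each interval, as in the location of a median.
    """
    total = sum(frequencies)
    target = p / 100 * total      # number of cases to be counted off
    below = 0                     # cases lying below the current interval
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and below + f >= target:
            return lower + (target - below) / f * width
        below += f
    return lower_limits[-1] + width

lower_limits = [0, 5, 10, 15, 20]     # assumed real lower limits
frequencies = [10, 12, 20, 8, 3]      # Table VII, lowest interval first
width = 5

q1 = percentile_point(lower_limits, width, frequencies, 25)
q3 = percentile_point(lower_limits, width, frequencies, 75)
p10 = percentile_point(lower_limits, width, frequencies, 10)
p90 = percentile_point(lower_limits, width, frequencies, 90)

print("Q (semi-interquartile range):", (q3 - q1) / 2)
print("Middle 50 per cent ranged from", q1, "to", q3)
print("D (10th to 90th percentile):", p90 - p10)
```

Stating the two limits, as the last lines do, follows the advice given above that the scores at the limits are more informative than the bare difference.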


SIZE OF SCORES AND VARIABILITY MEASURES 


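Effect of Multiplying or Dividing All Scores of a Distribution by
a Constant.—If all scores of a distribution take the form $ax'$,
where $a$ is a constant and $x'$ a variable, the standard deviation of
the distribution becomes

$$\sigma_{ax'} = \sqrt{\frac{\Sigma (ax')^2}{N} - \frac{(\Sigma ax')^2}{N^2}} = \sqrt{\frac{a^2\Sigma x'^2}{N} - \frac{a^2(\Sigma x')^2}{N^2}}$$

$$= a\sqrt{\frac{\Sigma x'^2}{N} - \left(\frac{\Sigma x'}{N}\right)^2} = a\sigma_{x'} \tag{31}$$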


Thus, if all scores in a distribution are multiplied by a constant $a$,
the standard deviation of the distribution also becomes $a$ times
as great. Obviously the same proof would hold if $a$ were a fraction
$c/b$. Therefore $\sigma_{cx'/b} = (c/b)\sigma_{x'}$. This same law could easily
be shown to hold for average deviation and for all the point
measures of variability.

¹ A. S. Otis has devised a new percentile chart on which the frequencies in
a normal distribution can be plotted on a straight line instead of the inverted
S of the usual percentile curve. This is accomplished by spacing the abscissa
lines in inverse proportion to the frequencies in a normal distribution. The
chart is published by the World Book Company.

Effect upon $\sigma$ of Adding a Constant to All Scores in a
Distribution.—It will next be shown that any constant may be added to
all the scores of a distribution, or subtracted from the scores,
without affecting the standard deviation.

$$\sigma_{x'+a} = \sqrt{\frac{\Sigma (x' + a)^2}{N} - \frac{[\Sigma (x' + a)]^2}{N^2}}$$

$$= \sqrt{\frac{\Sigma x'^2 + 2a\Sigma x' + \Sigma a^2}{N} - \frac{(\Sigma x')^2 + 2\Sigma x'\,\Sigma a + (\Sigma a)^2}{N^2}}$$

$$= \sqrt{\frac{\Sigma x'^2 + 2a\Sigma x' + Na^2}{N} - \frac{(\Sigma x')^2 + 2Na\Sigma x' + N^2a^2}{N^2}}$$

$$= \sqrt{\frac{N\Sigma x'^2 + 2Na\Sigma x' + N^2a^2 - (\Sigma x')^2 - 2Na\Sigma x' - N^2a^2}{N^2}}$$

$$= \sqrt{\frac{\Sigma x'^2}{N} - \frac{(\Sigma x')^2}{N^2}} = \sigma_{x'} \tag{32}$$


The reader can easily verify the fact that, if we had used
$(x' - a)$ instead of $(x' + a)$, we would have emerged with the
same result. It is thus proved that adding a constant to each
score in a distribution, or subtracting a constant from each score, 
does not affect the standard deviation; it only moves the whole 
distribution up or down. The same law can easily be shown 
to hold for all the other measures of variability. 
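
Both laws are easily confirmed by trial. In the sketch below, which is an illustration of ours rather than part of the text, `statistics.pstdev` (the standard deviation with divisor N) is applied to a set of scores, to the same scores multiplied by an arbitrarily chosen constant, and to the same scores with an arbitrarily chosen constant added.

```python
import statistics

scores = [0, 3, 8, 12, 15, 20, 25, 29]
a = 4          # hypothetical multiplier
shift = 20     # hypothetical additive constant

sigma = statistics.pstdev(scores)
sigma_scaled = statistics.pstdev([a * x for x in scores])
sigma_shifted = statistics.pstdev([x + shift for x in scores])
sigma_lowered = statistics.pstdev([x - shift for x in scores])

print(sigma_scaled / sigma)   # equals a, apart from rounding
print(sigma_shifted - sigma)  # zero, apart from rounding
print(sigma_lowered - sigma)  # zero, apart from rounding
```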


MAKING VARIABILITY MEASURES COMPARABLE 
FOR DIFFERENT DISTRIBUTIONS 

It follows from our demonstration that $\sigma_{ax'} = a\sigma_{x'}$ that the
standard deviation of a distribution, as well as all the other
measures of variability, is greatly affected by the order of size of
its scores. A standard deviation of, say, 8 in one distribution
does not necessarily mean greater relative scatter than a $\sigma$ of 0.02
in another distribution, for the scores in the former may all be of
an order 400 times as large as in the latter. In order to make
variability measures comparable, Pearson has proposed a
measure, called coefficient of variation, that puts variability in
terms of the mean of the distribution, since the mean responds
directly to the general order of size of the scores. The formula is

$$V = \frac{100\,\sigma}{M} \qquad \text{(Coefficient of variation)} \tag{33}$$

In spite of the fact that this measure has received considerable 
attention from statistical workers, the authors have doubts of its 
value. For the mean may be distorted by a padding of all the
scores. Consider the series of scores: 0, 3, 8, 12, 15, 20, 25, 29;
and the series 20, 23, 28, 32, 35, 40, 45, 49. The mean of the 
first array is 14 and that of the second array is 34. The coefficient 
of variation of the first is 68 while that of the latter is only 28. 
Nevertheless the variabilities of the two distributions are pre- 
cisely the same, the distortion in coefficients of variation being 
due solely to the padding of the scores in one of the arrays. As a
matter of fact, if the zero point in any distribution is located 
where the scores begin to diverge, as it should properly be, and if 
the distribution is normal, the mean will always tend to have a 
value of about 3 sigmas, so that all coefficients of variation would 
tend to be around 33. Thus they would lose all value for com- 
parative purposes. They differ from 33, and hence seem to have 
a value, chiefly because of some abnormality in the placement of 
the zero point and only to small degree because of flatness in 
the distribution. Thus the coefficient of variation tells us much 
more about the extent to which the scores are padded by a dis- 
location of the zero point than it does about comparable
variabilities. A much more promising standard measure of the shape
of a distribution, comparable for all distributions, would be the
measure of kurtosis called $\beta_2$, for which the formula is

$$\beta_2 = \frac{\Sigma x^4}{N\sigma^4} \qquad \text{($\beta_2$, a measure of the kurtosis, i.e., the flatness, of a distribution)} \tag{34}$$
But this measure has the disadvantage that it involves computing 
fourth powers of our scores, whereas for standard deviations we 
need only second powers. 
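
The effect of padding upon the coefficient of variation may be verified for the two series of scores quoted above. The following Python sketch is our own illustration; the function name is hypothetical, and the standard deviation is taken with divisor N, as elsewhere in this chapter.

```python
import statistics

def coefficient_of_variation(scores):
    """Formula (33): 100 times sigma divided by the mean."""
    return 100 * statistics.pstdev(scores) / statistics.mean(scores)

first = [0, 3, 8, 12, 15, 20, 25, 29]
second = [x + 20 for x in first]   # the same scores padded by 20

print(statistics.pstdev(first), statistics.pstdev(second))
print(coefficient_of_variation(first))    # a little under 69
print(coefficient_of_variation(second))   # a little over 28
```

The two standard deviations are identical, while the coefficients differ widely, which is the distortion described in the text.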


MEASURES OF SYMMETRY IN DISTRIBUTIONS 


If a distribution is symmetrical, its mode, median, and mean 
will all lie at the same point. If it is skew positively (i.e., has a 
larger tail stretching out toward the high scores than toward the 
low ones), its mean will be larger than its median and its mode will 
tend to lie below these two. If it is negatively skew, the reverse 
will be the case. Several measures of skewness have been 
proposed, but perhaps the following one is best: 
$$\text{Skewness} = \frac{\text{mean} - \text{mode}}{\sigma} = \frac{M - [M - 3(M - \text{Mdn.})]}{\sigma} = \frac{3(M - \text{Mdn.})}{\sigma} \tag{35}$$


The value substituted for mode in the formula is that shown for 
it on page 54. 
Another measure of skewness, in terms of higher moments, is

$$\beta_1 = \frac{(\Sigma x^3)^2}{N^2\sigma^6} \qquad \text{($\beta_1$, a measure of skewness in terms of higher moments)} \tag{36}$$

For symmetrical distributions, including normal distributions,
$\beta_1$ is zero. For a normal distribution $\beta_2$, mentioned above, is 3.
We shall later give proof of this.
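
Formulas (34), (35), and (36) may be sketched in Python as follows. The function names and the two small sets of scores are hypothetical, chosen only to show a symmetrical and a positively skew case; moments are taken about the actual mean with divisor N.

```python
import statistics

def moment(scores, k):
    """Mean of the k-th powers of the deviations from the actual mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** k for x in scores) / n

def pearson_skewness(scores):
    """Formula (35): 3(M - Mdn.) / sigma."""
    m = statistics.mean(scores)
    return 3 * (m - statistics.median(scores)) / statistics.pstdev(scores)

def beta_1(scores):
    """Formula (36), written in terms of moments: m3^2 / m2^3."""
    return moment(scores, 3) ** 2 / moment(scores, 2) ** 3

def beta_2(scores):
    """Formula (34), written in terms of moments: m4 / m2^2."""
    return moment(scores, 4) / moment(scores, 2) ** 2

symmetrical = [1, 2, 3, 4, 5]
skew_positive = [1, 2, 2, 3, 10]   # long tail toward the high scores

print(pearson_skewness(symmetrical), beta_1(symmetrical), beta_2(symmetrical))
print(pearson_skewness(skew_positive), beta_1(skew_positive))
```

Since $\Sigma x^2 = N\sigma^2$, the moment forms used in the code are algebraically the same as the printed formulas.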


COMPARABLE SCORES 


Scores from different types of data are likely to differ from one 
another very widely in general order of size and variability. 
Before they can be conveniently compared with one another and 
certainly before they can be legitimately averaged, it is desirable 
to put all of them in terms of similar units. One way of doing 
this is to take all of them as deviations from the means of their 
respective distributions divided by the standard deviation of the 
distribution. Thus, if $X$ is a score and $M_x$ the mean of the
distribution to which $X$ belongs, our deviation is $X - M_x$, and our
standard score is

$$z = \frac{x}{\sigma_x} = \frac{X - M_x}{\sigma_x} \qquad \text{(“Standard score,” also called a $z$ score)} \tag{37}$$


All z scores are comparable since they all tend to range from 
about —3 to +3, have a mean at zero, and a standard deviation 
of 1. That the mean is zero follows from the fact that, in any 
distribution, the deviations above the mean and those below the 
mean sum to zero. That the standard deviation of a full set of 
z scores is 1 may easily be shown as follows: 


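$$\sigma_z = \sqrt{\frac{\Sigma z^2}{N}} = \sqrt{\frac{\Sigma (x/\sigma_x)^2}{N}} = \sqrt{\frac{\Sigma x^2}{N\sigma_x^2}} = \sqrt{\frac{\sigma_x^2}{\sigma_x^2}} = 1$$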


We shall later find that z scores have the further advantage that 
the mean of the products of paired ones gives directly the coeffi- 
cient of correlation between the two arrays. 
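
The two properties of $z$ scores just demonstrated can be checked on any set of scores. The short Python sketch below is illustrative only; the function name is our own, and the standard deviation is the one with divisor N.

```python
import statistics

def z_scores(scores):
    """Formula (37): deviation from the mean divided by sigma."""
    mean = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    return [(x - mean) / sigma for x in scores]

scores = [0, 3, 8, 12, 15, 20, 25, 29]
z = z_scores(scores)

print([round(value, 2) for value in z])
print(statistics.mean(z))     # zero, apart from rounding
print(statistics.pstdev(z))   # one, apart from rounding
```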


COMBINING SIGMAS FROM DIFFERENT SAMPLES 


Sometimes it is necessary to combine the standard deviations 
from a number of different samples, and the worker either does
not have available the original scores or wishes to avoid the labor
of an additional computation from the consolidated samples. It 
will not do simply to average the o’s. But, if the means of the 
samples are known as well as the o’s, the standard deviation for 
the consolidated set of samples can be correctly determined as 
follows: 

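If $x'$ denotes the deviation of a score from a sample mean and
$x$ its deviation from the weighted mean of all the samples,

$$\Sigma x_1'^2 = N_1\sigma_1^2; \qquad \Sigma x_1^2 = N_1\sigma_1^2 + N_1m_1^2$$

$$\Sigma x_2'^2 = N_2\sigma_2^2; \qquad \Sigma x_2^2 = N_2\sigma_2^2 + N_2m_2^2$$

$$\Sigma\Sigma x^2 = N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + \cdots + N_k\sigma_k^2 + N_1m_1^2 + N_2m_2^2 + \cdots + N_km_k^2$$

$$\sigma = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + \cdots + N_k\sigma_k^2 + N_1m_1^2 + N_2m_2^2 + \cdots + N_km_k^2}{N_1 + N_2 + N_3 + \cdots + N_k}}$$

(Formula for combining $\sigma$’s from different samples) (38)

where $m_j$ is the difference between the $j$th sample mean ($j$ being
any sample) and the weighted mean of all the means.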
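
Formula (38) can be tested against a direct computation from the consolidated scores. In the following Python sketch the function name and the two small samples are hypothetical; only the sizes, means, and $\sigma$’s of the samples are passed to the combining function.

```python
import math
import statistics

def combined_sigma(sizes, means, sigmas):
    """Formula (38): sigma of the consolidated samples."""
    total = sum(sizes)
    weighted_mean = sum(n * m for n, m in zip(sizes, means)) / total
    sum_of_squares = sum(
        n * (s * s + (m - weighted_mean) ** 2)   # N_j sigma_j^2 + N_j m_j^2
        for n, m, s in zip(sizes, means, sigmas)
    )
    return math.sqrt(sum_of_squares / total)

sample_a = [0, 3, 8, 12]
sample_b = [15, 20, 25, 29, 31]

sizes = [len(sample_a), len(sample_b)]
means = [statistics.mean(sample_a), statistics.mean(sample_b)]
sigmas = [statistics.pstdev(sample_a), statistics.pstdev(sample_b)]

print(combined_sigma(sizes, means, sigmas))
print(statistics.pstdev(sample_a + sample_b))   # direct computation
print(sum(sigmas) / 2)                          # simple average, which will not do
```

The first two values agree; the simple average of the $\sigma$’s is much too small here because it ignores the difference between the sample means.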


RELATIONS BETWEEN THE VARIABILITY MEASURES 
When we come to the chapter on the normal curve, we shall 
find reasons for the following relations among measures of vari- 
ability. They hold strictly only for perfectly normal distribu- 
tions but will be found to represent the relations pretty closely in 
most of the distributions met in practice. 


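$$\begin{aligned}
Q &= 0.6745\,\sigma & \sigma &= 1.2532\ \text{A.D.} \\
\text{A.D.} &= 0.7979\,\sigma & \sigma &= 1.4825\,Q \\
D &= 2.5631\,\sigma & \text{A.D.} &= 1.1830\,Q
\end{aligned}$$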
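
These constants can be recovered from the unit normal curve. The sketch below is our own illustration and relies on two standard facts that the text defers to a later chapter: the quartile and decile points of a normal distribution are given by its inverse cumulative function, and its average deviation is $\sigma\sqrt{2/\pi}$.

```python
import math
from statistics import NormalDist

unit_normal = NormalDist()   # mean 0, sigma 1

q = unit_normal.inv_cdf(0.75)                             # Q in sigma units
ad = math.sqrt(2 / math.pi)                               # A.D. in sigma units
d = unit_normal.inv_cdf(0.90) - unit_normal.inv_cdf(0.10)  # D in sigma units

print(round(q, 4), round(ad, 4), round(d, 4))
print(1 / ad, 1 / q, ad / q)   # sigma per A.D., sigma per Q, A.D. per Q
```

The reciprocals printed on the second line agree with the tabled 1.2532, 1.4825, and 1.1830 to within about one unit in the fourth decimal place.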


In the chapter on Reliability (page 151) we shall find that the 
standard deviation is the most “reliable” of the variability 
measures; its standard error is least of all in terms of its own 
magnitude. After this comes A.D., then D, then Q. This, and 
certain other mathematical properties, are given as reasons why 
the standard deviation is to be preferred to all other measures of 
variability in refined statistical work.

But, in spite of these advantages in theoretical statistics, the
standard deviation is not a very apt statistic to use in describing 
variability for lay readers; it is “Greek” to them. Aside from 
the range, the variability measure likely to carry the most con- 
crete meaning to laymen is the range of the middle 50 per cent. 
Of the moment values, it is the average deviation rather than the 
standard deviation which will seem most sensible and meaningful 
to such readers. As far as reliability is concerned, the superiority
of the standard deviation over the average deviation is so slight
as to leave the average deviation a useful statistic to employ in 
describing an empirical distribution, especially when addressing 
a lay audience. 


AN INDEX OF INSTITUTIONALIZATION 

Professor Floyd H. Allport and his associates have studied 
the conformist behavior of individuals under the pressure of the 
mores, or other sanctions, and have found that a j shaped curve 
frequently describes it. When coming upon a stop sign, for 
example, most automobile drivers may come to a full stop. But 
some may merely slow up to a near stop, a smaller proportion 
slow up less, and a small proportion may go ahead without any 
slackening of speed. If units of degrees of slowing are placed on 
an X axis and frequencies on a Y axis, the distribution of these 
frequencies will be shaped like a j, or like a reversed j. The 
statistics of j shaped curves have not been very fully worked. 
Such statistics as means and standard deviations are, of course, 
formally applicable to this type of distribution as well as to other 
types, but they do not seem to give a very apt description of the 
situation here involved. We are suggesting as a useful statistic
for this purpose $\beta_2'$, which has the conventional meaning of $\beta_2$
except that the moments are to be taken as deviations from the
norm rather than from the mean. Let us approach this through
an example. 

Frederiksen, Frank, and Freeman¹ noted the behavior of
motorists at a sharp turn on a multiple-lane highway, where
safety demanded that cars should remain in their own lanes.
They recorded the number of cars which

0. Conformed to the standard by staying completely in line.

1. Crossed the white line less than half a car width.

2. Crossed the white line more than half a car width but did
not cut lanes.

3. Cut lanes.

¹ FREDERIKSEN, N., G. FRANK, and H. FREEMAN, “A Study of Conformity
to a Traffic Regulation,” J. Abn. and Soc. Psychol., Vol. 34, p. 120 (1939).

The percentages in each of these cases are given below for cars
driven by private chauffeurs, and beneath the line of frequencies
per hundred are the calculations required for $\beta_2'$:


UNITS OF DIVERGENCE FROM THE NORM

x          0       1       2       3      Sum
f         85.3    12.1     1.7     0.8    100
x^2        0       1       4       9       —
fx^2       0      12.1     6.8     7.2    26.1
fx^4       0      12.1    27.2    64.8   104.1

$$\beta_2' = \frac{\Sigma fx^4/N}{(\Sigma fx^2/N)^2} = \frac{N\Sigma fx^4}{(\Sigma fx^2)^2} = \frac{(100)(104.1)}{(26.1)^2} = 15.3$$


For taxi drivers the percentages in the four classes were 


0 1 2 3 
80.4 14.3 3.1 2.1 
$\beta_2' = 11.8$


The $\beta_2'$ gives a measure of conformity which increases in size
as the extent of institutionalization or socialization increases. It 
will be noted that private chauffeurs are more susceptible to the 
pressures upon them—their behavior is more institutionalized— 
than are the taxi drivers. This index of institutionalization 
would be 1 for complete nonconformity. For a chance allocation 
of frequencies into a rectangular distribution (with the norm 
taken as the first class), it would be 2 for four classes and nearly 
2 for other numbers of classes, the formula for its exact value 
being 

$$\frac{6(3n^2 - 3n - 1)}{5(2n^2 - 3n + 1)}$$


where n is the number of classes along the x axis. For a normal 
distribution (with the norm at the mode) it would be 3. As the 
extent of conformity increases so as to make a more and more 
narrow stemmed j, the index of institutionalization increases and 
approaches infinity as the conformity approaches completeness.

This procedure assumes equal spacing of the units of extent
of conformity along the x axis. But we can see no alternative to
this. It would be possible to take the moments for $\beta_2$ from the
mean, as is the custom, but we believe taking them from the
norm gives for this purpose a much more meaningful and useful 
index. If the investigator’s purpose is not to measure the degree 
of institutionalization but instead, or in addition, to express the 
slope of the curve in terms of an equation, he has available the 
possibility of fitting to his data one of several types of curves, 
including the curve of decay which we discuss in Chap. XV. 
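
The computation of the index, and the special values quoted for it, can be reproduced with a short Python sketch. The function names are our own, and the `total` argument is an assumption introduced so that the frequencies per hundred may be treated as summing to exactly 100, as in the worked example above (the printed percentages add to 99.9).

```python
def institutionalization_index(frequencies, total=None):
    """Beta-2 with moments taken from the norm (class 0), classes equally spaced."""
    n = sum(frequencies) if total is None else total
    sum_fx2 = sum(f * k ** 2 for k, f in enumerate(frequencies))
    sum_fx4 = sum(f * k ** 4 for k, f in enumerate(frequencies))
    return n * sum_fx4 / sum_fx2 ** 2

def rectangular_value(n_classes):
    """Exact value of the index for equal frequencies in n classes."""
    n = n_classes
    return 6 * (3 * n * n - 3 * n - 1) / (5 * (2 * n * n - 3 * n + 1))

chauffeurs = [85.3, 12.1, 1.7, 0.8]   # per cent in classes 0, 1, 2, 3

print(round(institutionalization_index(chauffeurs, total=100), 1))  # 15.3
print(institutionalization_index([1, 1, 1, 1]))   # rectangular, four classes
print(rectangular_value(4))                       # the same, from the formula
print(institutionalization_index([0, 0, 0, 5]))   # complete nonconformity
```

The rectangular case gives 2 for four classes by either route, and the case in which every car falls in the extreme class gives 1, as stated in the text.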


PROOF OF SHEPPARD’S CORRECTION FORMULA 

The reader is warned that the following discussion is highly 
technical. The student of relatively elementary statistics should 
skip it. 

As a preliminary to the development of Sheppard’s correction 
formula we shall need to develop Taylor's formula (“Taylor’s 
series”) because it is involved in the Sheppard development and 
was not included in our chapter on calculus. 

Let $S$ be the sum of a power series in terms of $(x - a)$, where
$x$ is a variable and $a$ is a constant. Then the sum of the series must
be a function of $x$, and, if we may assume that the series converges,
we may write as follows:

$$S = f(x) = b_0 + b_1(x - a) + b_2(x - a)^2 + b_3(x - a)^3 + \cdots \tag{A}$$

where the coefficients $b_0$, $b_1$, $b_2$, etc., remain to be determined.
It is our purpose now to get values for these coefficients.

If we substitute in Eq. (A) $x = a$, we get $b_0 = f(a)$, for all the
other terms become zero and drop out. That is, the first term
on the right equals the value of the function on the left when $x$
is evaluated at $a$. Take now the first derivative of $f(x)$ and get

$$f'(x) = b_1 + 2b_2(x - a) + 3b_3(x - a)^2 + 4b_4(x - a)^3 + \cdots$$

Substituting again $x = a$, we get $b_1 = f'(a)$. That is, the
coefficient of the second term is the first derivative of the function
$f(x)$ when that derivative is evaluated at $x = a$. Take now a
second derivative

$$f''(x) = 1\cdot 2\,b_2 + 2\cdot 3\,b_3(x - a) + 3\cdot 4\,b_4(x - a)^2 + \cdots$$

Letting $x = a$, we get from the above

$$1\cdot 2\,b_2 = f''(a); \qquad b_2 = \frac{f''(a)}{1\cdot 2}$$

The third derivative is

$$f'''(x) = 1\cdot 2\cdot 3\,b_3 + 1\cdot 2\cdot 3\cdot 4\,b_4(x - a) + \cdots$$

When this is evaluated at $x = a$ it gives

$$1\cdot 2\cdot 3\,b_3 = f'''(a); \qquad b_3 = \frac{f'''(a)}{1\cdot 2\cdot 3}$$


If we continue this process, we shall obviously get the following,
when $f^{(n)}(a)$ stands for the $n$th derivative of the function $f(x)$ when
the derivative is evaluated at $x = a$, and $n!$ stands for factorial
$n$, i.e., the product of all the integers from 1 to $n$ inclusive:

$$S = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \cdots \tag{B}$$

That is, if $f(x)$ is developable into a power series in $(x - a)$, the
coefficients of the successive powers of $(x - a)$ must necessarily
be the successive derivatives of $f(x)$ when these derivatives are
evaluated at $x = a$, divided by the factorial of the power of
$(x - a)$ in the respective terms.

That is Taylor’s series. We can now put it in a form more
useful for our immediate purpose by replacing $a$ by $x_i$ and then
letting $x = a + h = x_i + h$, where $(x_i + h)$ varies over a subset
of $x$ values within which subset $x_i$ is constant and $h$ is a variable
increment. Making in (B) these substitutions,

$$S = f(x_i + h) = f(x_i) + \frac{f'(x_i)}{1!}h + \frac{f''(x_i)}{2!}h^2 + \cdots + \frac{f^{(n)}(x_i)}{n!}h^n + \cdots \tag{C}$$


This second form of Taylor’s series enables us, when more 
convenient, to shift from the development of a power series
in terms of one variable to its development in terms of another
variable. 
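As a numerical illustration of the second form of the series, the following Python sketch sums the first few terms for the exponential function, which is our own choice of example and is not taken from the text. Every derivative of exp evaluated at the fixed point equals exp at that point, so the coefficient of the kth power of h is simply that value divided by factorial k. The function name `taylor_exp` is hypothetical.

```python
import math

def taylor_exp(x_i, h, n_terms):
    """Approximate exp(x_i + h) by the first n_terms of the Taylor series in h.

    Each derivative of exp evaluated at x_i is exp(x_i), so the k-th term
    is exp(x_i) / k! * h**k.
    """
    return sum(math.exp(x_i) / math.factorial(k) * h ** k for k in range(n_terms))

approximation = taylor_exp(1.0, 0.5, 10)   # ten terms, expanded about x_i = 1
exact_value = math.exp(1.5)
print(approximation, exact_value)
```

With a small increment h the partial sums approach the exact value quickly, which is why only the first few terms are retained in the development that follows.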

We are now ready to take up the development of Sheppard’s 
formula for the correction of a standard deviation for broad 
categories. In the discussion we shall need to anticipate some 
facts and symbolism about the normal curve which we treat in 
full in a later chapter. 

The standard deviation is the square root of the sum of the 
squares of the individual scores (in deviation form) divided by 
the number of cases. But an integral is a sum, and in this 
development we shall freely replace the conventional summation 
sign by the symbol of integration. Remember that the scores 
are laid off as to size on the x axis and that the height of the 
curve, denoted by y, expresses the frequencies. Basically a 
standard deviation is determined from scores, but in practice 
it is often convenient to group scores into intervals and to treat 
these intervals as scores themselves. As pointed out on page 72, 
this grouping makes a difference, and it is the purpose of the 
present development to derive a formula for inferring the correct 
standard deviation, which would be obtained by working from 
individual scores, from the one obtained by working from 
intervals as units. 

The height of any ordinate of the normal curve is dependent
upon its place along the x axis; i.e., the y is a function of x and
may be written as f(x). Since the standard deviation must
sum the squares of all scores in the distribution from −∞ to
+∞, we may write it,

$$\text{(D)}\qquad N\sigma^2 = \int_{-\infty}^{+\infty} x^2 f(x)\,dx$$


But when we work with grouped data, we do not consider the
x placement of every individual score but take all within an
interval as having the value of the mid-point of the interval.
Let us call such mid-point $x_i$. The frequency corresponding
to any $x_i$ value will then, of course, be the population within
the particular interval. If, for the sake of distinction from the
corrected σ, we place a bar under the σ to indicate that it is the
standard deviation computed from mid-points of intervals, our
formula will stand as follows:

$$\text{(E)}\qquad N\underline{\sigma}^2 = \sum_{-\infty}^{+\infty} x_i^2 \int_{x_i - \frac{1}{2}h}^{x_i + \frac{1}{2}h} f(x)\,dx$$

where h is the length of the interval.

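Let us now make the substitution $x = x_i + u$. Then since
$x_i$ is a constant for any interval, dx = du. Our limits of
integration will now be $-\frac{1}{2}h$ and $+\frac{1}{2}h$, and we may write

$$\text{(F)}\qquad N\underline{\sigma}^2 = \sum x_i^2 \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} f(x_i + u)\,du$$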


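We shall now utilize Taylor’s formula, (C) developed above,
to expand $f(x_i + u)$ in powers of u, replacing $f(x_i + u)$ in (F) by
this value. Then

$$\text{(G)}\qquad N\underline{\sigma}^2 = \sum x_i^2 \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} \left[ f(x_i) + f'(x_i)\,u + \frac{f''(x_i)}{2!}u^2 + \frac{f'''(x_i)}{3!}u^3 + \frac{f''''(x_i)}{4!}u^4 + \cdots \right] du$$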


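Indicating the integration term by term and taking the summation
with each term, (G) may be written

$$\text{(H)}\qquad N\underline{\sigma}^2 = \sum x_i^2 f(x_i) \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} du + \sum x_i^2 f'(x_i) \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} u\,du + \frac{1}{2!}\sum x_i^2 f''(x_i) \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} u^2\,du + \frac{1}{3!}\sum x_i^2 f'''(x_i) \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} u^3\,du + \cdots$$

Integrating as indicated in (H), we get

$$\text{(I)}\qquad N\underline{\sigma}^2 = \sum x_i^2 \Big[ f(x_i)\,u \Big]_{-\frac{1}{2}h}^{+\frac{1}{2}h} + \sum x_i^2 \left[ \frac{f'(x_i)\,u^2}{2} \right]_{-\frac{1}{2}h}^{+\frac{1}{2}h} + \sum x_i^2 \left[ \frac{f''(x_i)\,u^3}{2!\cdot 3} \right]_{-\frac{1}{2}h}^{+\frac{1}{2}h} + \sum x_i^2 \left[ \frac{f'''(x_i)\,u^4}{3!\cdot 4} \right]_{-\frac{1}{2}h}^{+\frac{1}{2}h} + \cdots$$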


We must now evaluate expression (I) between the limits specified;
i.e., we must substitute in each term $-\frac{1}{2}h$ and $+\frac{1}{2}h$ and
take the difference between the values at these two limits.
When we do so, each term containing an even power of u will
drop out, since the upper and the lower limits will yield the same
numerical value and will have the same sign. We are thus left
with the following:

$$\text{(J)}\qquad N\underline{\sigma}^2 = h \sum x_i^2 f(x_i) + \frac{h^3}{2^3\cdot 3} \sum x_i^2 f''(x_i) + \frac{h^5}{2^4\cdot 5!} \sum x_i^2 f''''(x_i) + \cdots$$


The values of the summation terms may be found by the
Euler-Maclaurin sum formula. This formula puts the expression
of a sum in terms of an integral and certain further terms which
themselves involve derivatives. Since in our special case we are
dealing with the normal curve function and since all the terms of
the Euler-Maclaurin sum formula for this case, except the first,
involve derivatives of the normal function which are to be
evaluated at the limits −∞ and +∞, where they equal zero,
all the terms except the first in each summation drop out. The
Euler-Maclaurin formula also involves 1/h as a coefficient of
each integral when the sum is equated to it. Hence, applying
the Euler-Maclaurin formula to (J), we are left with

$$\text{(K)}\qquad N\underline{\sigma}^2 = \int_{-\infty}^{+\infty} x^2 f(x)\,dx + \frac{h^2}{2^3\cdot 3} \int_{-\infty}^{+\infty} x^2 f''(x)\,dx + \frac{h^4}{2^4\cdot 5!} \int_{-\infty}^{+\infty} x^2 f''''(x)\,dx + \cdots$$


Let us now consider in succession the terms on the right.
Remember that f(x) is y, the frequency; and notice that x has
replaced $x_i$. From (D) we have that the first term is $N\sigma^2$.

The second term requires multiplying $x^2$ by the second derivative
of the normal curve, integrating, and multiplying by $h^2/(2^3\cdot 3)$,
which equals $h^2/24$. On page 27 we learned that the second
derivative of the normal curve function is

$$f''(x) = \left( \frac{x^2}{\sigma^4} - \frac{1}{\sigma^2} \right) f(x)$$

Substituting this value for $f''(x)$ in the second term and $N\sigma^2$ for the
first term and neglecting the remaining terms, which when
evaluated are found to have values so small that they may be
considered trivial compared with the value of the first two, we
may write

$$N\underline{\sigma}^2 = N\sigma^2 + \frac{h^2}{24} \int_{-\infty}^{+\infty} x^2 \left( \frac{x^2}{\sigma^4} - \frac{1}{\sigma^2} \right) f(x)\,dx$$

Separating the second term into two integrals,

$$N\underline{\sigma}^2 = N\sigma^2 + \frac{h^2}{24} \int_{-\infty}^{+\infty} \frac{x^4 f(x)}{\sigma^4}\,dx - \frac{h^2}{24} \int_{-\infty}^{+\infty} \frac{x^2 f(x)}{\sigma^2}\,dx$$


In the normal curve chapter we learn that the quantity

$$\frac{\int_{-\infty}^{+\infty} x^4 f(x)\,dx}{N\sigma^4}$$

which is called $\beta_2$, equals 3 for a normal distribution. The
second-term integral has, therefore, the value 3N. As we have
seen before, the value of the integral in the numerator of the
third term is $N\sigma^2$. Making these substitutions, we have

$$N\underline{\sigma}^2 = N\sigma^2 + \frac{3Nh^2}{24} - \frac{Nh^2}{24}$$

Canceling the N appearing in each term and combining the last
two terms, we are left with

$$\underline{\sigma}^2 = \sigma^2 + \frac{h^2}{12}$$

Transposing and indicating the square root, we get the formula

$$\sigma = \sqrt{\underline{\sigma}^2 - \frac{h^2}{12}}$$


The reader must be cautioned that the $\underline{\sigma}^2$ under the radical
has been developed in terms of score values. In practice it
will, in a frequency distribution, have been calculated in terms
of intervals. By applying formula (31), we can put this in terms
of intervals. Calling $\sigma_i^2$ the variance calculated in intervals as
units,

$$\sigma = \sqrt{h^2\sigma_i^2 - \frac{h^2}{12}} = h\sqrt{\sigma_i^2 - \frac{1}{12}} \qquad (39)$$
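The correction can be tried numerically with a short Python sketch. The function names and the example figures (a grouped standard deviation of 2.0 intervals and an interval length of 5 score units) are hypothetical, chosen only to show that subtracting one-twelfth of the squared interval length in score units agrees with subtracting one-twelfth in interval units and then multiplying by the interval length.

```python
import math

def sheppard_corrected_sd(sd_grouped, h):
    """Corrected sigma when the grouped sigma and h are both in score units."""
    corrected_variance = sd_grouped ** 2 - h ** 2 / 12.0
    if corrected_variance < 0:
        raise ValueError("interval too broad for the correction to apply")
    return math.sqrt(corrected_variance)

def sheppard_corrected_sd_from_intervals(sd_in_intervals, h):
    """Corrected sigma when the grouped sigma was computed in interval units."""
    return h * math.sqrt(sd_in_intervals ** 2 - 1.0 / 12.0)

# Hypothetical figures: sigma of 2.0 intervals, each interval 5 score units long.
h = 5.0
sd_intervals = 2.0
from_score_units = sheppard_corrected_sd(h * sd_intervals, h)
from_interval_units = sheppard_corrected_sd_from_intervals(sd_intervals, h)
print(from_score_units, from_interval_units)
```

The two routes give the same corrected value, and that value is always somewhat smaller than the uncorrected one, as the derivation leads us to expect.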
Exercises


1. Using the distributions set up in the exercises from Table IV in Chap. 
II, compute variability measures to as great extent as may be needed to 
bring you to the necessary masteries. 

a. Standard deviations.

b. Average deviations.

c. Percentiles.

d. Decile points.

e. Quartile points.

f. Interdecile ranges.

g. Interquartile ranges.

h. Semi-interquartile ranges.

2. Compute one or more standard deviations from distributions grouped 
into broad categories (three to six intervals) and apply Sheppard’s correction. 
If you have employed the same data from Table IV from which you com- 
puted o’s in Exercise 1, compare the o obtained from broad groupings with 
that obtained from individual scores or from groupings in narrow ranges. 

3. Compute measures of skewness for the distributions with which you 
have worked. 

4. Turn a sample of scores from one of the distributions into “standard
scores.” How nearly does the mean of the standard scores of this sample
come to zero? Why the discrepancy?

5. For a sample of about 40 scores from one of the distributions compute
$\beta_1$ and $\beta_2$.


6. For this same sample compute A. 


References for Further Study 


Dickey, J. W.: “On the Reliability of a Standard Score,” J. Educ. Psychol., 
Vol. 21, pp. 547-549. 

Horst, Paul: “Obtaining Comparable Scores from Distributions of Differ-
ent Shapes,” J. Amer. Statistical Assoc., Vol. 26, pp. 453-460.

Sheppard, W. F.: “The Calculation of the Moments of a Frequency Dis-
tribution,” Biometrika, Vol. 5, pp. 452-458. (On the same topic see
also Pearson in Biometrika, Vol. 3, pp. 308-309.)

Rietz, H. L.: Handbook of Mathematical Statistics, 1924, p. 30. (An
additional correction in A.D. for the interval containing the mean.)


CHAPTER IV 
THE BASIC FORMULAS OF RECTILINEAR CORRELATION 


Correlation relates to the extent to which two series vary con- 
comitantly. We can compute a coefficient of correlation when, 
and only when, scores in two related series are paired. Thus we 
can determine the coefficient of correlation between history scores 
and geography scores if each of a set of students has a score in 
history and a score in geography. We can correlate the scores of 
a set of boys with those of a set of girls if both sexes have taken 
the same test and our concern is to see how closely the two sexes 
parallel each other in the proportion knowing the several items 
of the test, for here each item has a pair of scores. But, apart 
from the binding of the two series by paired scores, it is not 
possible to apply correlation methods in the technical sense. 

It is important that a student should understand the nature of 
correlation, not merely work with its formulas as magic. The 
principle back of correlation is really very simple. Suppose a 
student makes the score of 9 points above the mean score for his
group on a history test and also 9 points above the mean on a
geography test. We shall lay this off on Fig. 13 by going 9
units to the right from the intersection of the two central axes for
the geography score and then 9 units upward to represent the
history score. Point A, therefore, represents the location of this
student with respect to both his scores. Suppose student B
makes 12 above the mean in history and also 12 above the mean in
geography; C makes 14 below the mean in history and 14 below
in geography; D makes 8 below in each; and E makes 15 above in
each. It is easy to draw a straight line through all of these
points; and this line will pass through the intersection of the XX
and the YY axes, which point is technically termed the origin.
At point A on this line the perpendicular distances to the XX
axis and to the YY axis are equal. That is, AS = SO, whence
AS/SO = 1. A corresponding thing is true if we take other
points on the line: B, C, D, E, or any other. The value represented
by the ratio between these two legs of the right triangle,
also, is called the slope of the line which constitutes the hypotenuse,
for which value we shall employ the letter b. In our
problem, b is evidently 1. Each y value is, therefore, equal to
1 times the corresponding x value, and the relation is one of
perfect agreement.


Fig. 13.—Perfect correlation.


But let us next post, in Fig. 14, dots representing the scores 
in Table VIII, page 98. These dots have a tendency to fall 
along a straight line, but we would have a difficult time to draw a 
single line through all of them and certainly no straight line could 
be made to pass through them all. But we can draw a straight 
line that passes through the group of them and that represents 
the general trend of the group of points as nearly as possible. 
This line will have a slope which will indicate the general tendency 
of the scores in the one series to be greater or less when those of 
the other are greater or less. We shall call the slope of this line 
b, as before. 
But how find the value of this b? The answer to that question
constitutes the essence of a correlation formula. One can make
an empirical estimate of b by stretching a string and adjusting it
until it seems most nearly to fit the measures. The student is
advised to try that method with the problem of Fig. 14. Starting
from any point whatever on his thread, he should count the
number of units on the squared paper down (or up) to the XX
axis, then the number along this axis back to the origin (where
the axes intersect). The former divided by the latter is the slope
of the line and approximately the coefficient of correlation.


TABLE VIII.—ILLUSTRATION OF COMPUTATION OF CORRELATION BY
INDIVIDUAL PAIRS

Data, scores on the Abbott-Trabue Test of Appreciation of Poetry by the
same pupils at interval of 5 months, slightly doctored

             First    Second   Deviation   Deviation
Individual   score,   score,   of X from   of Y from   x²    y²    xy
             X        Y        mean, x     mean, y
A            4        4        -1          -1           1     1     1
B            7        6         2           1           4     1     2
C            7        7         2           2           4     4     4
D            8        7         3           2           9     4     6
E            6        5         1           0           1     0     0
F            7        5         2           0           4     0     0
G            2        5        -3           0           9     0     0
H            8        7         3           2           9     4     6
I            7        8         2           3           4     9     6
J            6        6         1           1           1     1     1
K            3        4        -2          -1           4     1     2
L            3        2        -2          -3           4     9     6
M            5        3         0          -2           0     4     0
N            4        4        -1          -1           1     1     1
O            5        3         0          -2           0     4     0
P            3        9        -2           4           4    16    -8
Q            4        5        -1           0           1     0     0
R            4        3        -1          -2           1     4     2
S            3        1        -2          -4           4    16     8
T            6        6         1           1           1     1     1
U            9        8         4           3          16     9    12
V            2        5        -3           0           9     0     0
W            6        6         1           1           1     1     1
X            5        3         0          -2           0     4     0
Y            1        3        -4          -2          16     4     8
Σ                                                     108    98    59

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \cdot \Sigma y^2}} = \frac{59}{\sqrt{108 \cdot 98}} = .57$$


Fig. 14.—Positive but imperfect correlation.


The Pearson product-moment correlation formula is merely a 
more precise device for finding the slope of this line. It is based 
on a principle, generally accepted by mathematicians, that a line 
best fits its data when the sum of the squares of the misses (errors) 
is a minimum. The development of a formula for the slope of a
line that fulfills this condition is very simple, but it involves a 
little calculus. 

Let b be the slope of the line required (the straight line that
best fits the trend of the paired measures); let x be a given score
in the first series (in deviation form); let y be a corresponding
score in the second series; and let $\bar{y}$ be the value this y score
would need to have if it were to lie exactly on the regression line.
Then, by definition of b,

$$\bar{y} = bx$$

The “error” by which y misses $\bar{y}$ is $(y - \bar{y})$. The condition of
best fit is that the sum of the squares of such errors for all the
pairs of scores in the problem shall be a minimum. Hence
$\Sigma(y - \bar{y})^2$ is to be a minimum. Substituting for $\bar{y}$ its equivalent
bx, squaring, then placing the summation sign with each member,
which is a legitimate way of summing such a quantity, we have

$$\Sigma(y - \bar{y})^2 = \Sigma(y - bx)^2 = (\Sigma y^2 - 2b\Sigma xy + b^2\Sigma x^2)$$

is to be a minimum. Since we are to find a value for b that will
make this quantity a minimum, we must differentiate the expression
with respect to b and set the derivative equal to zero (see
page 10 of this volume). Since the first term contains no b, it
will disappear from the derivative. In each of the other two
terms the elements other than b will be unaffected by the
differentiation, but the b will have its exponent decreased by 1, and
the coefficient of the term will be multiplied by what had been
the exponent of the b. Thus differentiating, we get

$$-2\Sigma xy + 2b\Sigma x^2 = 0$$

Transposing, then dividing by the coefficient of b,

$$2b\Sigma x^2 = 2\Sigma xy; \qquad b_{yx} = \frac{\Sigma xy}{\Sigma x^2} \qquad (40)$$

(Formula for the slope of a straight line fitting the trend of
paired measures so as to minimize the y residuals)


That is the formula for the slope of a line fitting paired measures
so as to minimize the y residuals; it is called the regression
formula for y on x. Frequently it is used in just that form,
especially in business statistics. But we may put it in more
familiar shape if we divide both sides of the equation $2b\Sigma x^2 = 2\Sigma xy$
by 2N, N being the number of pairs of scores in the
problem.

$$\frac{b\Sigma x^2}{N} = \frac{\Sigma xy}{N}$$

Now we have seen $\Sigma x^2/N$ before; it is $\sigma_x^2$. Making this
substitution,

$$b\sigma_x^2 = \frac{\Sigma xy}{N}; \qquad b = \frac{\Sigma xy}{N\sigma_x^2}$$
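The slope formula can be checked on the deviation scores of Table VIII with a brief Python sketch. The function names are ours and purely illustrative. The sketch computes b both as the sum of products over the sum of squares and in the equivalent form that uses the variance of x, and it confirms that moving b slightly in either direction increases the sum of the squared misses.

```python
# Deviation scores (x, y) for the 25 pupils of Table VIII.
x = [-1, 2, 2, 3, 1, 2, -3, 3, 2, 1, -2, -2, 0, -1, 0, -2, -1, -1, -2, 1, 4, -3, 1, 0, -4]
y = [-1, 1, 2, 2, 0, 0, 0, 2, 3, 1, -1, -3, -2, -1, -2, 4, 0, -2, -4, 1, 3, 0, 1, -2, -2]

def slope_y_on_x(x, y):
    """Formula (40): b = sum(xy) / sum(x^2), for scores in deviation form."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def sum_squared_misses(b, x, y):
    """Sum of the squared errors (y - b*x)^2 for a trial slope b."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))

n = len(x)
b = slope_y_on_x(x, y)

# Equivalent form: b = sum(xy) / (N * variance of x).
variance_x = sum(xi * xi for xi in x) / n
b_from_variance = sum(xi * yi for xi, yi in zip(x, y)) / (n * variance_x)

print(round(b, 4))  # 59/108, i.e. 0.5463
```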

But still our formula for the slope of the line lacks a standard
meaning, because x and y may be measured in different units, and
the slope is greatly affected by the relative variabilities of the
measures employed. We shall remedy this by choosing a new
symbol, r, for the slope of the line when our measures have been
taken as $x/\sigma_x$ and $y/\sigma_y$, thus making the measures of equal
variability. In this notation

$$\frac{\bar{y}}{\sigma_y} = r\frac{x}{\sigma_x}; \quad \text{therefore} \quad \bar{y} = rx\frac{\sigma_y}{\sigma_x}$$

But in our former notation, $\bar{y} = bx$. Therefore

$$rx\frac{\sigma_y}{\sigma_x} = bx; \quad r = b\frac{\sigma_x}{\sigma_y}, \quad \text{and} \quad b = r\frac{\sigma_y}{\sigma_x}$$

Substituting for b in the expression for r thus derived,¹

$$r = b\frac{\sigma_x}{\sigma_y} = \frac{\Sigma xy}{N\sigma_x^2} \cdot \frac{\sigma_x}{\sigma_y} = \frac{\Sigma xy}{N\sigma_x\sigma_y} \qquad (41)$$

(Pearson product-moment formula for coefficient of correlation)

If we choose to do so, we may put this basic correlation formula in
a little different shape by substituting for $\sigma_x$ its value $\sqrt{\Sigma x^2/N}$
and for $\sigma_y$ its value $\sqrt{\Sigma y^2/N}$ and have

$$r = \frac{\Sigma xy}{N\sqrt{(\Sigma x^2/N) \cdot (\Sigma y^2/N)}} = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \cdot \Sigma y^2}} \qquad (41a)$$

(Second form for the basic Pearson product-moment correlation formula)


This is the principal formula for r, the Pearson product-moment
formula, whenever the measures are taken in the form of
deviations from the means. It is, you see, merely the formula
for the slope of the straight line best fitting the measures when
the correlation chart has been laid off square, that is, when the
variabilities in the two directions have been equalized. This line is
called the regression line. The student who knows a little
trigonometry will see that r is the tangent of the angle that the
regression line makes with the X axis under the special condition
that the variabilities of the two sets of measures shall have been
equalized. When we gather our data into columns, as is done in
Table IX, the regression line becomes the straight line most
nearly fitting the means of the columns as well as the straight line
most nearly fitting the separate measures. For this reason it is
sometimes called the line of the means. An r may be calculated
either from the individual paired scores or from a correlation
chart in which the scores have been grouped into intervals. The
formula has precisely the same fundamental meaning and
essentially the same form when using either arrangement. As the
student goes on through statistics, he will have many occasions
to marvel at the unexpected ways in which this correlation
formula crops up and at the transformations through which it
can be put. It is one of the most fascinating formulas of
mathematical science.

1 The standard deviations of the samples must be used in this and in
subsequent formulas for correlation, not the population σ's. If population
values are used in the denominator, they must also be used in the numerator,
and the two corrections precisely cancel each other.
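To make the two forms of the basic formula concrete, here is a small Python sketch that takes the raw scores of Table VIII, turns them into deviations from their means, and computes r by formula (41) and by formula (41a). The function name `pearson_r_both_forms` is hypothetical; the standard deviations are the sample values obtained by dividing by N, as the text requires.

```python
import math

# Raw first and second scores for the 25 pupils of Table VIII.
X = [4, 7, 7, 8, 6, 7, 2, 8, 7, 6, 3, 3, 5, 4, 5, 3, 4, 4, 3, 6, 9, 2, 6, 5, 1]
Y = [4, 6, 7, 7, 5, 5, 5, 7, 8, 6, 4, 2, 3, 4, 3, 9, 5, 3, 1, 6, 8, 5, 6, 3, 3]

def pearson_r_both_forms(X, Y):
    """Return r by formula (41) and by formula (41a), from deviation scores."""
    n = len(X)
    mean_x, mean_y = sum(X) / n, sum(Y) / n
    x = [v - mean_x for v in X]              # deviations from the true means
    y = [v - mean_y for v in Y]
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sigma_x = math.sqrt(sum_x2 / n)          # sample sigmas, dividing by N
    sigma_y = math.sqrt(sum_y2 / n)
    r_41 = sum_xy / (n * sigma_x * sigma_y)
    r_41a = sum_xy / math.sqrt(sum_x2 * sum_y2)
    return r_41, r_41a

r_41, r_41a = pearson_r_both_forms(X, Y)
print(round(r_41, 4), round(r_41a, 4))
```

Both forms give the same coefficient, since (41a) is only an algebraic rearrangement of (41).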

An inspection of the formula will show that the new element
involved in correlation is Σxy; the sigmas we have treated in an
earlier chapter. Σxy involves multiplying each x by its paired
y value and then summing these products algebraically (the
multiplying being done, of course, pair by pair before the adding).
The multiplying may be done pair by pair, or the xy products of a
like value may be grouped into frequencies and each xy value
multiplied by its frequency before addition. It is this latter
thing that one is doing when he computes an r from a correlation
chart. Sample solutions are shown on pages 93 and 100.

The formula for r that we developed above,

$$r = \frac{\Sigma xy}{N\sigma_x\sigma_y}$$
was based upon measures that are taken as deviations from the 
exact means of the series to which they belong. But it is seldom 
desirable to take deviations from the true mean in working a 
problem in correlation, since customarily we get decimals which 
are cumbersome to handle. We do better to work from some 
convenient assumed mean even though we may know the true 
mean, just as was the case also in computing standard deviations. 
The development of a correction formula that will allow us to 
use an assumed mean and yet get the correct r is very simple. 

Let x be the deviation of a score from the true mean in the X
series and y a corresponding deviation in the Y series. Let $x'$
and $y'$ be, correspondingly, deviations from the assumed means.
Let $c_x$ be the amount by which the assumed mean in the X series
differs from the true X mean and $c_y$ be a corresponding value in
the Y series. Then for any one pair,

$$x' = x + c_x \quad \text{and} \quad y' = y + c_y$$

The product of any one pair will be

$$x'y' = (x + c_x)(y + c_y) = xy + c_y x + c_x y + c_x c_y$$

Summing for all the pairs,

$$\Sigma x'y' = \Sigma xy + c_y\Sigma x + c_x\Sigma y + \Sigma c_x c_y$$

But, since x and y are the deviations from the true means,
their respective sums equal zero. Hence the two middle terms
become zero and drop out. Also $\Sigma c_x c_y$ becomes $Nc_x c_y$, because
this term is taken once for each pair. Therefore

$$\Sigma x'y' = \Sigma xy + Nc_x c_y$$

Transposing,

$$\Sigma xy = \Sigma x'y' - Nc_x c_y$$

Substituting this value in the original Pearson formula above,

$$r = \frac{(\Sigma x'y'/N) - c_x c_y}{\sigma_x\sigma_y} \qquad (42)$$

(One form of the product-moment formula when measures are
taken from assumed means)


In dealing with the c’s in the above formula, it must be remem- 
bered that the algebraic sign is to be considered. Sometimes 
the product of the two c’s must be added arithmetically to the 
rest of the formula instead of subtracted—if one happens to be 
positive and the other negative. 

We can, perhaps, simplify this formula further for ease of
computation by noting that $c_x$, the amount our assumed mean
in the X series missed the true mean, is always equal to $\Sigma x'/N$ and
similarly $c_y$ is equal to $\Sigma y'/N$. This is true no matter where the
assumed mean is taken, as we have already learned in our chapter
on central tendencies. It is true even when the assumed mean is
taken at zero, in which case all the deviations will be precisely
the same as the corresponding scores and the c’s will be exactly
the means of the respective series. We shall, therefore, have
perfectly general formulas if we substitute these values for the
c’s. We may also substitute for our σ’s equivalent values we
learned in our chapter on variabilities. Then

$$r = \frac{\Sigma x'y' - Nc_x c_y}{N\sigma_x\sigma_y}$$

$$= \frac{\Sigma x'y' - N(\Sigma x'/N)\cdot(\Sigma y'/N)}{N\sqrt{(\Sigma x'^2/N) - (\Sigma x'/N)^2}\,\sqrt{(\Sigma y'^2/N) - (\Sigma y'/N)^2}}$$

$$= \frac{\Sigma x'y' - \dfrac{\Sigma x' \cdot \Sigma y'}{N}}{\sqrt{\Sigma x'^2 - \dfrac{(\Sigma x')^2}{N}}\,\sqrt{\Sigma y'^2 - \dfrac{(\Sigma y')^2}{N}}}$$

$$= \frac{N\Sigma x'y' - \Sigma x'\,\Sigma y'}{\sqrt{[N\Sigma x'^2 - (\Sigma x')^2][N\Sigma y'^2 - (\Sigma y')^2]}} \qquad (43)$$

(Recommended general formula for product-moment correlation when
measures are taken from an assumed mean)


The x’s and the y’s may be taken from any mean the worker 
pleases, and the resulting r will be not only approximately 
but absolutely the same; or the scores may be taken exactly 
as they stand, which amounts to taking them as deviations 
from zero as an assumed mean. To take them thus as original 
scores saves all subtractions and all necessity for watching 
algebraic signs (unless the original scores themselves involve 
differently signed numbers). But the numbers will be larger 
than if we work from a mean near the true mean. When working 
with a Monroe calculating machine, we always use this last 
formula, taking the x’s and the y’s in terms of the original
scores, because the formula taken in this manner fits the calculat- 
ing machine ideally and large numbers are no handicap in working 
with a machine. But the worker who is operating with a pencil 
may prefer smaller numbers, even if he must bother with plus 
and minus signs, and hence will wish to work from an assumed 
mean as close as feasible to the true one. But the formula is 
precisely the same in either case. We recommend the last 
formula given above and the basic formula (for measures taken 
as deviations from the true means) as the only product-moment 
correlation formulas worth the student’s effort to remember. 
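The claim that any assumed mean gives exactly the same r can be verified with a few lines of Python. The sketch below applies formula (43) to the scores of Table VIII, first as they stand (assumed means of zero) and then from arbitrarily chosen assumed means of 6 and 4. The function name and the particular assumed means are our own illustrative choices.

```python
import math

# Raw first and second scores for the 25 pupils of Table VIII.
X = [4, 7, 7, 8, 6, 7, 2, 8, 7, 6, 3, 3, 5, 4, 5, 3, 4, 4, 3, 6, 9, 2, 6, 5, 1]
Y = [4, 6, 7, 7, 5, 5, 5, 7, 8, 6, 4, 2, 3, 4, 3, 9, 5, 3, 1, 6, 8, 5, 6, 3, 3]

def r_from_assumed_means(X, Y, assumed_x=0.0, assumed_y=0.0):
    """Formula (43), with deviations x', y' taken from the assumed means."""
    n = len(X)
    xp = [v - assumed_x for v in X]
    yp = [v - assumed_y for v in Y]
    sum_x, sum_y = sum(xp), sum(yp)
    sum_x2 = sum(v * v for v in xp)
    sum_y2 = sum(v * v for v in yp)
    sum_xy = sum(a * b for a, b in zip(xp, yp))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r_original_scores = r_from_assumed_means(X, Y)            # assumed means at zero
r_shifted = r_from_assumed_means(X, Y, 6.0, 4.0)          # arbitrary assumed means
print(round(r_original_scores, 4), round(r_shifted, 4))
```

Working from zero gives larger intermediate sums but no negative numbers; working from assumed means near the true ones gives smaller sums with mixed signs. The coefficient is the same either way.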
TABLE IX.—SCORES ON ODD-NUMBERED AND EVEN-NUMBERED ITEMS OF A
TEST IN EDUCATIONAL PSYCHOLOGY BY 106 COLLEGE STUDENTS
Arranged in a correlation table


                                      Odds
Evens | 10-14| 15-19| 20-24| 25-29| 30-34| 35-39| 40-44| 45-49| 50-54| 55-59| 60-64| 65-69| 70-74| f_y| Y| fY| fY²
65-69 2| 2/11] 22| 242 
60-64 1 1\10| 10} 100 
55-59 paa 4l 9| 36| 324 
50-54 2 2 al 8| 32| 256 
45-49 1 Ha aa 6| 7| 42| 294 
40-44 1) 1 3} 3] 2} 2 12) 6| 72| 432 
35-39 I shea B AT A dl 13| 5| 65| 325 
30-34 5| 6| 6 9 1 27| 4/108} 432 
25-29| 1] 1] 1] 5] 5] 5) 2) 1 21| 3] 63| 189 
20-24] 1 il 3l 2| 2 9| 2| 18| 36 
15-19/ 1] 1) J} 1 1 Bt) 5} 5 
10-14 1 1 2} 0} o o 
f_x  | 3| 2| 5| 16| 18| 15| 18| 8| 8| 7| 3| 1| 2| 106| | 473| 2,635
X    | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9| 10| 11| 12
fX   | 0| 2| 10| 48| 72| 75| 108| 56| 64| 63| 30| 11| 24
fX²  | 0| 2| 20| 144| 288| 375| 648| 392| 512| 567| 300| 121| 288
ΣY   | 6| 4| 12| 53| 65| 49| 82| 45| 49| 49| 28| 9| 22| 473
ΣXY  | 0| 4| 24| 159| 260| 245| 492| 315| 392| 441| 280| 99| 264


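$$r = \frac{N\Sigma XY - \Sigma X \cdot \Sigma Y}{\sqrt{[N\Sigma X^2 - (\Sigma X)^2][N\Sigma Y^2 - (\Sigma Y)^2]}} = \frac{106 \cdot 2975 - 563 \cdot 473}{\sqrt{(106 \cdot 3657 - 563^2)(106 \cdot 2635 - 473^2)}} = .783$$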
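The arithmetic of this computation can be retraced in Python from the marginal rows of Table IX alone. In the sketch below the column frequencies and the column sums of Y are copied from the table; the variable names are ours. The sums for Y are taken from the row totals shown in the table (473 and 2,635).

```python
import math

# Column marginals of Table IX: step values X, frequencies f_x, and sum of Y per column.
X_steps = list(range(13))
f_x = [3, 2, 5, 16, 18, 15, 18, 8, 8, 7, 3, 1, 2]
sum_Y_by_column = [6, 4, 12, 53, 65, 49, 82, 45, 49, 49, 28, 9, 22]

N = sum(f_x)
sum_X = sum(f * x for f, x in zip(f_x, X_steps))
sum_X2 = sum(f * x * x for f, x in zip(f_x, X_steps))
sum_XY = sum(x * sy for x, sy in zip(X_steps, sum_Y_by_column))
sum_Y = sum(sum_Y_by_column)
sum_Y2 = 2635  # total of the fY² column of the table

numerator = N * sum_XY - sum_X * sum_Y
denominator = math.sqrt((N * sum_X2 - sum_X ** 2) * (N * sum_Y2 - sum_Y ** 2))
r = numerator / denominator
print(N, sum_X, sum_X2, sum_XY, sum_Y)
print(round(r, 3))  # 0.783
```

Because the scores are taken as they stand in step units, no subtractions from an assumed mean are needed anywhere in the work.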


The product-moment formula for correlation holds exactly 
only for scores taken pair by pair; when an r is computed from a 
correlation chart, it loses somewhat in precision. However, 
if the number of cases is 40 or more and the number of categories 
for each of the arrays is reasonably large, the loss is sufficiently 
small to be negligible. But one should never compute an r from 
a correlation chart where the number of cases is less than 30 
or 40 unless the range of scores in an interval is only one or two. 
The number of categories should be about 12 or more. If an
r must be computed from a small number of categories, Sheppard’s
correction should be made in the standard deviations that
constitute the denominator of the fraction (see page 89). We
shall in a later chapter more fully discuss the problem of
correcting r for a small number of categories.


THE SUMS AND THE DIFFERENCES FORMULAS FOR r 


Sometimes it is very convenient to employ a formula for r 
that involves adding or subtracting the paired scores instead 
of multiplying them. We shall develop formulas for that 
purpose. 

Let d be the difference between any two paired scores when the
scores are expressed in terms of deviations from the means of
their respective series. Then

$$\sigma_d^2 = \frac{\Sigma(x - y)^2}{N} = \frac{\Sigma(x^2 - 2xy + y^2)}{N} = \frac{\Sigma x^2}{N} - \frac{2\Sigma xy}{N} + \frac{\Sigma y^2}{N}$$

Multiplying both numerator and denominator of the middle
term by $\sigma_x\sigma_y$ and putting each of the other terms in the form of
equivalent σ’s,

$$\sigma_d^2 = \sigma_x^2 + \sigma_y^2 - \frac{2\Sigma xy}{N\sigma_x\sigma_y}\,\sigma_x\sigma_y$$

But the final term now contains the formula for r. Substituting
r for its equivalent, then transposing and solving for r, we have

$$\sigma_d^2 = \sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y; \qquad 2r\sigma_x\sigma_y = \sigma_x^2 + \sigma_y^2 - \sigma_d^2$$

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2\sigma_x\sigma_y}$$

We arrived at this result by taking all our measures as deviations
from the means of their respective series. But we can
easily show that the same formula holds if we work with the
difference between raw scores instead of the difference between
deviation scores. $\sigma_x^2$ and $\sigma_y^2$ will be the same, of course, regardless
of whether we computed them in terms of deviations or in terms of
raw scores, provided we made the proper correction on account
of taking zero as the assumed mean. We need only show that,
if d is the difference between paired scores when these scores are
in deviation form and D is the difference between corresponding
raw scores, then $\sigma_d$ equals $\sigma_D$.

$$d = (x - y) = (X - M_x) - (Y - M_y) = X - Y - (M_x - M_y) = D - (M_x - M_y)$$

$\sigma_d$ will necessarily be the same as $\sigma_D$ because in the latter case
each item will merely have a constant $(M_x - M_y)$ subtracted
from it to make it identical in value with its corresponding d,
and the subtraction of the same value from each term of a series
does not affect the standard deviation of that series (see page
77). We may, therefore, write our formula for r as follows:

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_D^2}{2\sigma_x\sigma_y}$$

(Formula for r in terms of the differences between raw scores)


If the variabilities of the two arrays are equal, as would be
approximately true of two forms of a test, this formula simplifies
to the following:

$$r = \frac{\sigma^2 + \sigma^2 - \sigma_D^2}{2\sigma^2} = 1 - \frac{\sigma_D^2}{2\sigma^2} \qquad (45)$$

(Formula for r in terms of differences, assuming equal variabilities
in the two arrays)

If, finally, the means are equal as well as the variabilities,
we can simplify formula (45) a little further by considering the
$\sigma_D^2$.

$$\sigma_D^2 = \frac{\Sigma D^2}{N} - \left(\frac{\Sigma D}{N}\right)^2$$

But

$$\Sigma D = \Sigma(X - Y) = (\Sigma X - \Sigma Y) = (NM_x - NM_y) = N(M_x - M_y)$$

But, if the means are equal, the difference between $M_x$ and $M_y$
is zero and ΣD becomes zero. Therefore $(\Sigma D/N)^2$ becomes
zero and $\sigma_D^2$ equals $\Sigma D^2/N$. Substituting this value in formula
(45), we get

$$r = 1 - \frac{\Sigma D^2}{2N\sigma^2} \qquad (46)$$

(Formula for r in terms of differences between paired scores, assuming
equality of variabilities and of means in the two arrays)


This formula can be applied in certain practical situations, 
particularly in the correlation between two forms of the same 
test or two halves of the same test, but always with a certain 
risk that the assumption of equality of means may not hold. 
However, it is especially useful to us just now because it is the 
basis for the development of the important Spearman ranks
correlation formula which we shall treat presently. 
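The differences formulas lend themselves to a quick check in Python. The sketch below, with function names of our own choosing, computes r from the variances of X, Y, and D for the Table VIII scores, and then applies formula (46) to a small made-up pair of rankings, where the means and the variabilities of the two arrays are necessarily equal so that the simplified formula is exact.

```python
import math

def variance(values):
    """Variance with N in the denominator, as used throughout the text."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def r_from_differences(X, Y):
    """r = (var_x + var_y - var_D) / (2 * sigma_x * sigma_y), with D = X - Y."""
    D = [a - b for a, b in zip(X, Y)]
    return (variance(X) + variance(Y) - variance(D)) / (
        2 * math.sqrt(variance(X) * variance(Y)))

def r_equal_means_and_variabilities(X, Y):
    """Formula (46): r = 1 - sum(D^2) / (2 * N * sigma^2)."""
    n = len(X)
    sum_D2 = sum((a - b) ** 2 for a, b in zip(X, Y))
    return 1 - sum_D2 / (2 * n * variance(X))

# Raw scores of Table VIII.
X = [4, 7, 7, 8, 6, 7, 2, 8, 7, 6, 3, 3, 5, 4, 5, 3, 4, 4, 3, 6, 9, 2, 6, 5, 1]
Y = [4, 6, 7, 7, 5, 5, 5, 7, 8, 6, 4, 2, 3, 4, 3, 9, 5, 3, 1, 6, 8, 5, 6, 3, 3]
r_table_viii = r_from_differences(X, Y)

# Hypothetical rankings of five individuals in two series.
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]
r_ranks_general = r_from_differences(ranks_a, ranks_b)
r_ranks_formula_46 = r_equal_means_and_variabilities(ranks_a, ranks_b)
print(round(r_table_viii, 4), r_ranks_general, r_ranks_formula_46)
```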

The reader will be easily able to verify the fact that, if we had
taken X + Y = S, we would have arrived by a similar procedure
at a formula for r in terms of the sums of paired scores as follows:

$$r = \frac{\sigma_S^2 - \sigma_x^2 - \sigma_y^2}{2\sigma_x\sigma_y} \qquad (47)$$

(Formula for r in terms of sums of paired scores)

and if the variabilities may be assumed equal,

$$r = \frac{\sigma_S^2}{2\sigma^2} - 1 \qquad (48)$$

(Formula for r in terms of sums of paired scores, assuming equal
variabilities in the two arrays)


Occasionally we may have standing on our records an average 
between the paired scores instead of the sum of the scores, and 
we may wish to employ these data to get a coefficient of cor- 
relation between the two arrays from which the averages were 
taken. This would be the case, for example, where a teacher 
had entered in her book a mid-term grade, an end-term grade, 
and a final grade that was the average of the two and wished later 
to learn what had been the correlation between the grades 
for the two halves of the term. Since all measures couched in
averages are half as great as when couched in sums, the
variance of the averages will be one-fourth as large as that of the
sums. $\sigma_S^2$ equals, therefore, $4\sigma_a^2$, where $\sigma_a^2$ is the variance of the
averages, and

$$r = \frac{4\sigma_a^2 - \sigma_x^2 - \sigma_y^2}{2\sigma_x\sigma_y} \qquad (49)$$

(Formula for r in terms of the averages between paired scores)
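A short Python sketch, using the Table VIII scores and hypothetical function names, shows the sums formula and the averages formula returning the same coefficient as the product-moment computation.

```python
import math

def variance(values):
    """Variance with N in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def r_from_sums(X, Y):
    """Formula (47): r = (var_S - var_x - var_y) / (2 sigma_x sigma_y), S = X + Y."""
    S = [a + b for a, b in zip(X, Y)]
    return (variance(S) - variance(X) - variance(Y)) / (
        2 * math.sqrt(variance(X) * variance(Y)))

def r_from_averages(X, Y, averages):
    """Formula (49): the variance of the sums is four times that of the averages."""
    return (4 * variance(averages) - variance(X) - variance(Y)) / (
        2 * math.sqrt(variance(X) * variance(Y)))

# Raw scores of Table VIII, with the recorded average of each pair.
X = [4, 7, 7, 8, 6, 7, 2, 8, 7, 6, 3, 3, 5, 4, 5, 3, 4, 4, 3, 6, 9, 2, 6, 5, 1]
Y = [4, 6, 7, 7, 5, 5, 5, 7, 8, 6, 4, 2, 3, 4, 3, 9, 5, 3, 1, 6, 8, 5, 6, 3, 3]
averages = [(a + b) / 2 for a, b in zip(X, Y)]

r_sums = r_from_sums(X, Y)
r_averages = r_from_averages(X, Y, averages)
print(round(r_sums, 4), round(r_averages, 4))
```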


THE SPEARMAN RANKS FORMULA FOR CORRELATION 

We shall next develop a formula for computing a coefficient 
of correlation between two series when put in terms of the rank 
order of the items instead of raw scores. Thus we may know 
about a set of pupils only the order in which they rank in history 
and the order in which they rank in geography, and yet we may 
wish to compute a coefficient of correlation between standings 
in these two subjects. Or, even if we know the actual scores, 
we may prefer to translate these scores into rank orders and 
then compute the correlation coefficient from the ranks, on the 
ground that the mathematics involved is somewhat simpler. 
The formula that we shall treat for this purpose was first developed
by Spearman, the method is called the Spearman ranks
method, and the symbol for the coefficient of correlation thus
derived is designated by the Greek letter ρ (rho).

Our starting point in the development of Spearman’s formula 
is our formula (46) above; for in the case of two sets of con- 
tinuous ranks with the same number of individuals in each set 
it is clear that the means of the two arrays would be equal and 
so would the variabilities. However, our scores have now become 
ranks, so that D is now the difference between the ranks of an 
individual item in the two series. We might, with any given 
problem, apply in the customary way this formula just as it 
stands. But in the special case of ranks we can put it in a much 
more convenient form by getting a simpler equivalent for the 
$\sigma^2$ of the denominator.

This $\sigma^2$ is the square of the standard deviation of a set of n
continuous ranks,

$$\sigma^2_{ranks} = \frac{1^2 + 2^2 + 3^2 + \cdots + n^2}{n} - \left(\frac{1 + 2 + 3 + \cdots + n}{n}\right)^2$$


For the sake of an abbreviated notation we may use $\Sigma n^2$ to
represent the sum of the squares of all numbers from 1 to n,

$$1^2 + 2^2 + 3^2 + \cdots + n^2,$$

and $\Sigma n$ for the sum of the numbers 1 to n. Our formula for the
squared standard deviation of n continuous ranks will then stand as

$$\sigma^2_{ranks} = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2$$

Our hardest job will be to get a value for $\Sigma n^2$. We shall
attack that first. Let us write down the following identity:
$$(n + 1)^3 - n^3 = (n^3 + 3n^2 + 3n + 1) - n^3$$

or

$$(n + 1)^3 - n^3 = 3n^2 + 3n + 1$$

This statement is true for all values of n, since it is an identity
by selection. Therefore, we may replace n by $(n - 1)$ and still
retain an identity. That is, we shall have

$$[(n - 1) + 1]^3 - (n - 1)^3 = 3(n - 1)^2 + 3(n - 1) + 1$$

or

$$n^3 - (n - 1)^3 = 3(n - 1)^2 + 3(n - 1) + 1$$


If again we replace n in this expression by $(n - 1)$, we shall
obtain the identity

$$(n - 1)^3 - [(n - 1) - 1]^3 = 3[(n - 1) - 1]^2 + 3[(n - 1) - 1] + 1$$

or

$$(n - 1)^3 - (n - 2)^3 = 3(n - 2)^2 + 3(n - 2) + 1$$

etc. In general, then, we shall have a set of statements which
are identically true. These statements may be written as follows:

$$\begin{aligned}
(n + 1)^3 - n^3 &= 3n^2 + 3n + 1 \\
n^3 - (n - 1)^3 &= 3(n - 1)^2 + 3(n - 1) + 1 \\
(n - 1)^3 - (n - 2)^3 &= 3(n - 2)^2 + 3(n - 2) + 1 \\
(n - 2)^3 - (n - 3)^3 &= 3(n - 3)^2 + 3(n - 3) + 1 \\
(n - 3)^3 - (n - 4)^3 &= 3(n - 4)^2 + 3(n - 4) + 1 \\
&\;\;\vdots \\
[n - (n - 2)]^3 - [n - (n - 1)]^3 &= 3[n - (n - 1)]^2 + 3[n - (n - 1)] + 1 \\
\text{i.e.,} \quad 2^3 - 1^3 &= 3 \cdot 1^2 + 3 \cdot 1 + 1
\end{aligned}$$


Now if we add these identities we shall, of course, obtain as
the sum another identity. Making the addition, we notice
that certain terms cancel each other. On the left side of the
equation we shall have uncanceled only the first and the last
terms. On the right none will cancel. But in the first term
the quantity in parentheses starts at n and decreases by 1 down
to 1, so that, when we add the first terms for all the equations, we
shall have $3\Sigma n^2$. For the same reason we shall have for the
second term on the right $3\Sigma n$. There is in the third term a 1
for each of the n equations, so that the sum of these will be n.
The summing will, therefore, give us the equation

$$(n + 1)^3 - 1^3 = 3\Sigma n^2 + 3\Sigma n + n$$

or

$$n^3 + 3n^2 + 3n = 3\Sigma n^2 + 3\Sigma n + n$$

$\Sigma n$ equals, of course, half the sum of the first and the last term
multiplied by the number of terms, i.e.,

$$\Sigma n = \frac{n(n + 1)}{2}$$
Substituting this value, transposing so that we may have on
the left side of the equation the $\Sigma n^2$ (for which we are seeking
a value) and all other terms on the right, clearing of fractions
by multiplying through the equation by 2, and then factoring,
we have the following:

$$\begin{aligned}
n^3 + 3n^2 + 3n &= 3\Sigma n^2 + 3\Sigma n + n \\
3\Sigma n^2 &= n^3 + 3n^2 + 3n - \frac{3n(n + 1)}{2} - n \\
6\Sigma n^2 &= 2n^3 + 6n^2 + 6n - 3n(n + 1) - 2n \\
6\Sigma n^2 &= 2n^3 + 6n^2 + 6n - 3n^2 - 3n - 2n \\
6\Sigma n^2 &= 2n^3 + 3n^2 + n \\
6\Sigma n^2 &= n(2n^2 + 3n + 1) \\
6\Sigma n^2 &= n(2n + 1)(n + 1) \\
\Sigma n^2 &= \frac{n(2n + 1)(n + 1)}{6}
\end{aligned}$$
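The telescoping argument can be checked by direct arithmetic. The following is a minimal sketch in Python, offered only as an illustration; the function names are our own and hypothetical. It solves the summed identity $(n + 1)^3 - 1 = 3\Sigma n^2 + 3\Sigma n + n$ for $\Sigma n^2$ and compares the result both with the closed form and with a plain term-by-term sum.

```python
def sum_of_squares_closed_form(n: int) -> int:
    """Closed form reached in the derivation: n(2n + 1)(n + 1) / 6."""
    return n * (2 * n + 1) * (n + 1) // 6


def sum_of_squares_by_telescoping(n: int) -> int:
    """Recover the sum of squares from the summed identities.

    Adding (k + 1)^3 - k^3 = 3k^2 + 3k + 1 for k = 1..n gives
    (n + 1)^3 - 1 = 3*S2 + 3*S1 + n, where S1 = n(n + 1)/2.
    Solving for S2 yields the sum of the squares.
    """
    left_side = (n + 1) ** 3 - 1
    s1 = n * (n + 1) // 2
    return (left_side - 3 * s1 - n) // 3


for n in (1, 2, 5, 10, 100):
    direct = sum(k * k for k in range(1, n + 1))
    assert direct == sum_of_squares_closed_form(n) == sum_of_squares_by_telescoping(n)
    print(n, direct)
```

All three routes agree for every n tried, which is what an identity requires.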

Repeating, now, our formula for the variance of a set
of ranks, substituting in it the values of $\Sigma n^2$ and of $\Sigma n$ which
we found in the above process of reasoning, and simplifying
algebraically, we have

$$\begin{aligned}
\sigma^2_{ranks} &= \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2 = \frac{n(2n + 1)(n + 1)}{6n} - \frac{n^2(n + 1)^2}{4n^2} \\
&= \frac{2n^2 + 3n + 1}{6} - \frac{n^2 + 2n + 1}{4} \\
&= \frac{4n^2 + 6n + 2 - 3n^2 - 6n - 3}{12} \\
\sigma^2_{ranks} &= \frac{n^2 - 1}{12}
\end{aligned}$$
(Formula for the standard deviation of a set of ranks) (50)
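The result that the squared standard deviation of n continuous ranks is $(n^2 - 1)/12$ can be confirmed with exact fractions. This is a small Python sketch for illustration only; `rank_variance` is a hypothetical helper name, and the variance is taken with N (not N − 1) in the denominator, as in the text.

```python
from fractions import Fraction


def rank_variance(n: int) -> Fraction:
    """Variance of the ranks 1..n: mean of squares minus square of the mean."""
    ranks = range(1, n + 1)
    mean_of_squares = Fraction(sum(k * k for k in ranks), n)
    mean = Fraction(sum(ranks), n)
    return mean_of_squares - mean ** 2


for n in (2, 5, 10, 30):
    closed_form = Fraction(n * n - 1, 12)
    assert rank_variance(n) == closed_form
    print(n, rank_variance(n), float(closed_form))
```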


We are now near the end of our development. We shall
repeat formula (46), from which we started, but shall substitute
$\rho$ for r in recognition of the fact that we are dealing with ranks
instead of raw scores, and then simplify our formula algebraically.

$$\rho = 1 - \frac{\Sigma D^2}{2N\sigma^2_{ranks}} = 1 - \frac{\Sigma D^2}{\dfrac{2N(N^2 - 1)}{12}} = 1 - \frac{\Sigma D^2}{\dfrac{N(N^2 - 1)}{6}}$$

$$\rho = 1 - \frac{6\Sigma D^2}{N(N^2 - 1)}$$
(Correlation formula from ranks) (51)
From this formula, it may be remarked parenthetically, we
may easily derive a general formula for the standard deviation
of any rectangular distribution. If we regard the length of
the rectangle as divided into n equal intervals, the length of
the rectangle will be the sum of these n divisions. If a is the
frequency in each interval (in a rectangular distribution the
frequency must be the same in each interval), the standard
deviation will be, when squared,

$$\sigma^2 = \frac{\Sigma a n^2}{an} - \left(\frac{\Sigma a n}{an}\right)^2 = \frac{a\Sigma n^2}{an} - \left(\frac{a\Sigma n}{an}\right)^2 = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2$$

which is just what we had to begin with in the above development.
The standard deviation of the rectangular distribution
will, therefore, be $\sqrt{(n^2 - 1)/12}$. Now, if we let the number
of subdivisions increase indefinitely (by allowing our intervals
to become indefinitely small), so that we eliminate the inaccuracy
resulting from grouping the contents about the mid-points of
intervals instead of taking the items in their proper places, the
1 will become negligible in comparison with the $n^2$. Consequently,
if we replace $n^2$ by $s^2$, where s is the length of the rectangle,
as the limit is approached, $(n^2 - 1)$
will approach $s^2$, and we shall have

$$\sigma_{rect} = \sqrt{\frac{s^2}{12}} = \sqrt{\frac{1}{12}}\, s$$
(Formula for the standard deviation of any rectangular distribution) (52)

Thus the standard deviation is $\sqrt{1/12}$ times the length of the
rectangle.
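The passage to the limit can be watched numerically. The sketch below, in Python, is illustrative only: the function name and the length of 10 units are assumptions of ours. It takes a rectangle of fixed length, groups its contents at the mid-points of n equal intervals, and shows the grouped standard deviation approaching the length divided by $\sqrt{12}$ as n grows.

```python
import math


def grouped_rectangle_sd(length: float, n_intervals: int) -> float:
    """SD of a rectangle split into n equal intervals, every item being
    placed at the mid-point of its interval."""
    width = length / n_intervals
    # In units of one interval the SD is sqrt((n^2 - 1) / 12).
    return width * math.sqrt((n_intervals ** 2 - 1) / 12)


length = 10.0  # hypothetical length s of the rectangle
limit = length / math.sqrt(12)
for n in (2, 10, 100, 10_000):
    print(n, round(grouped_rectangle_sd(length, n), 5), round(limit, 5))
```

The grouped value is always a little below the limit, because the 1 subtracted from $n^2$ never quite vanishes for finite n.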

Returning to our principal development, it will be observed
that the formula for $\rho$ is merely a transformation of the formula
for r. The $\rho$ is therefore substantially equivalent to r. The two
would be identical if it were not for the fact that something is
lost in accuracy when translating scores into ranks, because ranks
are equally spaced while scores seldom are. Pearson has given
a formula for translating $\rho$ into r. It is as follows:

$$r = 2 \sin\left(\frac{\pi}{6}\rho\right)$$
(Formula for translating $\rho$ to r) (53)

The $\pi$ here is in radian units for measuring an angle and is
equivalent to 180°. This 180° divided by 6 always gives 30°,
so that one needs each time to multiply 30° by his obtained
$\rho$, to look up in a set of trigonometric tables the sine of the
resultant angle, then to take r as twice this sine. Suppose $\rho$
turns out to be 0.10. This 0.10 times 30° equals 3°. Looking
up in a table of trigonometric functions the sine of 3°, we find it
to be 0.05234. This multiplied by the 2 called for in our formula
gives the value of r, to the third decimal place, as 0.105.
Suppose $\rho$ is 0.50. Thirty degrees multiplied by 0.50 gives 15°.
The sine of 15° is 0.25882, and twice this is 0.518, which is the
equivalent value of r. If $\rho$ is 0.97, this times the 30° gives 29.1°,
which equals 29° and 6′. The sine of this angle is 0.48634, and,
consequently, r is 0.973.
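The table look-up described above may be replaced by a direct computation. The following Python sketch is an illustration only; `rho_to_r` is a hypothetical name of our own choosing. It applies Pearson's translation, twice the sine of 30° times the rank coefficient, to the three values worked in the text.

```python
import math


def rho_to_r(rho: float) -> float:
    """Pearson's translation of a rank coefficient: r = 2 sin(pi * rho / 6)."""
    return 2 * math.sin(math.pi * rho / 6)


for rho in (0.10, 0.50, 0.97):
    angle_degrees = round(30 * rho, 1)  # pi/6 radians is 30 degrees
    print(rho, angle_degrees, round(rho_to_r(rho), 3))
```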

In many texts in statistics, tables are printed giving the
equivalents in r for each value of $\rho$. But Pearson’s correction
formula is based upon the assumption of a large normal distribution
in each of the correlated series in which the scores corresponding
to the ranks are found. The distributions from which
the ranks for the computation of $\rho$ are taken are nearly always
very small and rarely if ever completely normal. Pearson’s
correction would seem, then, to be inapplicable in the practical
situations in which we employ $\rho$; although on the average from
many applications we would be nearer the truth with the application
than without it, in any particular case we could not be at
all sure that we were nearer the truth after we had applied it
than before. Besides, the correction is never greater than
0.018, and that is practically always well within the probable
error of the correlation coefficient. $\rho$ is often considerably in
error as compared with r, due to loss in accuracy in the translation
of scores into ranks, but this is a loss that can never be regained
by this correction formula or by any other. We advise, therefore,
making little or nothing of the difference between $\rho$ and r and
employing the symbol $\rho$ only to indicate that the correlation was
computed by the cruder ranks method.
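The statement that the correction never exceeds about 0.018 can be verified by scanning the whole range of positive coefficients. This Python sketch is illustrative only; the step of 0.001 is an arbitrary choice of ours, and the translation used is the formula r = 2 sin(π/6 · rank coefficient) quoted earlier.

```python
import math

# Scan the rank coefficient from 0 to 1 in steps of 0.001 and record the
# largest value of (translated r) - (rank coefficient), together with the
# coefficient at which that largest gap occurs.
largest_gap, rho_at_largest_gap = max(
    (2 * math.sin(math.pi * (i / 1000) / 6) - i / 1000, i / 1000)
    for i in range(1001)
)
print(round(largest_gap, 4), rho_at_largest_gap)
```

The gap is zero at both ends of the range and largest for coefficients a little above one-half.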

The ranks method may be employed where the number of 
cases is small, since with a small number of cases any coefficient 
of correlation shows only roughly the degree of actual relation 
between the areas sampled and the crude ranks method serves
the purpose sufficiently well. But, whenever the number of 
pairs goes beyond about 30, the labor of translating scores into 
ranks is greater than the saving from the simpler mathematical 
processes involved in the ranks formula. For that reason the 
Spearman ranks formula is not advised except perhaps where 
N is less than 30, unless the original data are ranks instead of
scores. But this is only a matter of convenience. There is no 
truth in the idea that the Pearson product-moment formula 
is less applicable to a small number of cases than the Spearman; 
the two formulas are merely different algebraic forms of the same 
thing, so that, wherever the one is applicable, the other is also 
applicable.
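That the two formulas are the same thing in different algebraic dress can be shown on any set of untied ranks. The Python sketch below is for illustration only: the function names and the eight pairs of ranks are hypothetical. It computes the rank coefficient as 1 − 6ΣD²/[N(N² − 1)] and the ordinary product-moment r on the same ranks.

```python
import math


def spearman_rho(ranks_x, ranks_y):
    """Rank coefficient from the differences D: 1 - 6*sum(D^2) / (N(N^2 - 1))."""
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def pearson_r(xs, ys):
    """Product-moment r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Hypothetical ranks of eight pupils in history and geography (no ties).
history = [1, 2, 3, 4, 5, 6, 7, 8]
geography = [2, 1, 4, 3, 7, 5, 8, 6]

print(spearman_rho(history, geography), pearson_r(history, geography))
```

The agreement holds only when each series consists of the unbroken ranks 1 to N; tied ranks would disturb the value $(N^2 - 1)/12$ on which the short formula rests.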

There is another correlation method based on ranks that is 
usually treated in books on statistics which we shall mention 
here merely to dismiss. It is called the foot-rule method and the 
coefficient derived by it is represented by the symbol R. It is 
not equivalent to the Pearson product-moment r but requires 
tables for translation. It seems to have no merits to recom- 
mend it, not even the ease of computation that some persons 
claim. We do not recommend that research workers learn it. 


ASSUMPTIONS ABOUT THE SHAPE OF THE DISTRIBUTIONS 
IN THE r FORMULA 

If the reader will recall our calculus development of the 
Pearson product-moment formula for r, he will notice that no 
assumptions whatever were involved regarding the shape of the 
distributions in the correlated series. The formula is general; 
it holds for distributions of any shape. r’s may be computed
between two series of percentiles, in spite of the fact that these 
make a rectangular distribution. If the regular product- 
moment formula were applied to two sets of correlated ranks, 
the resulting r would be identical with that gotten by the Spear- 
man method. The regular product-moment formula may 
be applied where one array is given in scores and the other in 
percentiles or in ranks. Kelley recommends¹ that, where one
set of scores is known only in terms of ranks and the other in
actual scores, a regular Pearson r be computed between the two
series as they stand rather than lose further accuracy by translating
the known scores into ranks. The Pearson correction
formula, where $\rho'$ has been calculated between a set of scores
and a set of ranks, is

$$r = \sqrt{\frac{\pi}{3}}\,\rho' = 1.0233\rho'$$
(54)


¹ Kelley, T. L., Statistical Method, The Macmillan Company, 1923, p. 194.
But the correction here, too, is so small that it is scarcely worth
making unless the N is large. 


PREDICTING IN TERMS OF THE REGRESSION EQUATION 


Of what use is a coefficient of correlation when computed? 
There are two uses it may serve: (1) to enable us to infer the score 
of an individual in a second array from our knowledge of his 
standing in a first array with which this second one is correlated 
by a known amount; and (2) to express the degree of interrelation, 
of concomitance, of community, between two systems of vari- 
ables. We shall first consider the former of these applications. 

At the opening of this chapter we saw that, if two series of 
paired scores are related, the scores in the one series may be 
treated as a function of those in the other. Also, if the relation 
is the simple rectilinear one with which we are dealing in this 
chapter, any y score will be a certain multiple of the correspond- 
ing x score, provided each is measured from the mean. This 
multiplier we called b, and for it we found a general formula
$b = \Sigma xy / \Sigma x^2$. This b is, as we said, the slope of the line that best
fits the trend of the paired measures. It is called the regression
coefficient. If it is the slope that the straight line makes with
the x axis when it passes through the successive x values in such
manner as best to fit the corresponding y measures, we call it
the coefficient of the regression of y upon x and designate it by
$b_{yx}$. If it is the slope that the line makes with respect to the
y axis when it passes through the successive y values in such
manner as best to fit the corresponding x measures, we call it the
coefficient of regression of x on y and designate it by $b_{xy}$.

Our development at the opening of the chapter showed that, 
when our terms are taken as deviations from the means of their 
respective arrays, 


$$b_{yx} = \frac{\Sigma xy}{\Sigma x^2} = \frac{\Sigma xy}{N\sigma_x^2} = \frac{\Sigma xy\,\sigma_y}{N\sigma_x\sigma_x\sigma_y} = \frac{\Sigma xy}{N\sigma_x\sigma_y} \cdot \frac{\sigma_y}{\sigma_x}$$

But the first part of this last term contains the formula for r.
Substituting r for this value, we have for our regression coefficient
of y upon x,

$$b_{yx} = r\frac{\sigma_y}{\sigma_x}$$
(Regression coefficient, y on x) (55)

If in our calculus development we interchange y and x, so that
we compute in terms of y, we get

$$b_{xy} = \frac{\Sigma xy}{\Sigma y^2} = r\frac{\sigma_x}{\sigma_y}$$
(Regression coefficient, x on y) (55a)
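The equivalence of the two ways of writing each regression coefficient can be tried on a few figures. The Python sketch below is illustrative only; the six pairs of scores are hypothetical, and the standard deviations are taken with N in the denominator, as throughout this chapter. It computes each slope once from sums of deviations and once as r times the ratio of the standard deviations.

```python
import math

# Hypothetical paired scores, used only to illustrate the algebra.
X = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
Y = [1.0, 3.0, 2.0, 6.0, 7.0, 8.0]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n
x = [v - mean_x for v in X]  # deviations from the mean of X
y = [v - mean_y for v in Y]  # deviations from the mean of Y

sum_xy = sum(a * b for a, b in zip(x, y))
sigma_x = math.sqrt(sum(a * a for a in x) / n)
sigma_y = math.sqrt(sum(b * b for b in y) / n)
r = sum_xy / (n * sigma_x * sigma_y)

b_yx_direct = sum_xy / sum(a * a for a in x)  # slope as sum(xy) / sum(x^2)
b_yx_from_r = r * sigma_y / sigma_x           # r times sigma_y over sigma_x
b_xy_direct = sum_xy / sum(b * b for b in y)  # slope as sum(xy) / sum(y^2)
b_xy_from_r = r * sigma_x / sigma_y           # r times sigma_x over sigma_y

print(b_yx_direct, b_yx_from_r)
print(b_xy_direct, b_xy_from_r)
```

A by-product worth noticing is that the product of the two regression coefficients equals the square of r.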


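We may now revert to our original equation for a straight line
$\bar{y} = b_{yx}x$ and substitute in it the value that we have just obtained
for $b_{yx}$:

$$\bar{y} = r\frac{\sigma_y}{\sigma_x}x; \quad \text{and similarly} \quad \bar{x} = r\frac{\sigma_x}{\sigma_y}y$$
(Regression equations in deviation form) (56)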

These are the regression equations in deviation form. From 
the former we can predict the most probable y score for an 
individual from a knowledge of his x score, and from the latter 
we can predict his most probable x score from a knowledge of 
his y score. Suppose, for example, we know that the correlation 
between intelligence test scores earned at entrance to college and
“point averages” attained in college is .40 and that a certain
boy makes a score of 18 below the mean in the intelligence
test. The $\sigma$ of the intelligence test scores is, we shall say, 30
while that of the “point averages” is 0.60. What point-average
attainment may be expected of him? The anticipated point
average is the $\bar{y}$; the other elements in the equation we know.

$$\bar{y} = .40 \cdot \frac{0.60}{30} \cdot (-18) = -0.144$$


The boy would, therefore, have indicated as his most probable 
score 0.144 below the mean. 
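The same computation may be put as a small function. This is a sketch in Python for illustration only; the function and parameter names are our own and hypothetical, while the figures are those of the example just worked.

```python
def predict_deviation(r: float, sigma_x: float, sigma_y: float, x_dev: float) -> float:
    """Regression equation in deviation form: r * (sigma_y / sigma_x) * x."""
    return r * (sigma_y / sigma_x) * x_dev


# Figures from the illustration: r = .40, sigma of test scores = 30,
# sigma of point averages = 0.60, and a test score 18 below the mean.
predicted = predict_deviation(r=0.40, sigma_x=30, sigma_y=0.60, x_dev=-18)
print(round(predicted, 3))
```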

So far we have been dealing with the regression equation
in deviation form. If we prefer, as normally we would, to handle
it in terms of raw scores, we need only put for x its equivalent
$(X - M_x)$ and for $\bar{y}$ its equivalent $(\bar{Y} - M_y)$, where the M’s
are the means. Our regression equation will then be

$$(\bar{Y} - M_y) = r\frac{\sigma_y}{\sigma_x}(X - M_x); \qquad \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - M_x) + M_y$$

$$\bar{Y} = r\frac{\sigma_y}{\sigma_x}X + \left(M_y - r\frac{\sigma_y}{\sigma_x}M_x\right)$$
(The regression equation in score form) (57)


The one for X would be symmetrical with this one for Y. 
Let us now illustrate the operation of this “score form”
regression equation. Let us say the mean of our intelligence
test scores is 100 and the $\sigma$ is 30, the mean of the point averages
is 1.40 and the $\sigma$ is .60, while the correlation between the two
series is .40. A certain boy makes a score of 82 in the intelligence
test. What may he be expected to earn in point-average
achievement?

$$\bar{Y} = .40 \cdot \frac{.60}{30} \cdot 82 + \left(1.40 - .40 \cdot \frac{.60}{30} \cdot 100\right)$$

$$\bar{Y} = .656 + 1.40 - .80 = 1.256$$


These data are intended to be for the same case as those of 
our illustration in the deviation form, and the result is the same 
in meaning as that we obtained there, as the reader can easily 
verify. The worker will likely be computing expectations for 
many individuals at one sitting. The value in parentheses 
will be the same for all cases and may be worked out once for 
all as far as a particular set of data is concerned. Likewise the 
$r(\sigma_y/\sigma_x)$ may be computed once for all the needed applications.
The routine computations for individual cases will then be very 
easy. 
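The economy of working out the constant parts once for all is easily shown. The following Python sketch is illustrative only; `make_predictor` is a hypothetical helper of our own, and the scores 100 and 130 are added merely as further examples. The multiplier and the value in parentheses are computed a single time, after which each individual prediction is one multiplication and one addition.

```python
def make_predictor(r, mean_x, sigma_x, mean_y, sigma_y):
    """Return a function applying the score-form regression equation.

    slope = r * sigma_y / sigma_x and intercept = M_y - slope * M_x are
    computed once and reused for every individual.
    """
    slope = r * sigma_y / sigma_x
    intercept = mean_y - slope * mean_x
    return lambda score: slope * score + intercept


# Figures from the illustration in the text.
predict_point_average = make_predictor(
    r=0.40, mean_x=100, sigma_x=30, mean_y=1.40, sigma_y=0.60
)

for test_score in (82, 100, 130):  # 100 and 130 are extra, hypothetical scores
    print(test_score, round(predict_point_average(test_score), 3))
```

A score exactly at the mean of the test yields a prediction exactly at the mean of the point averages, as the equation requires.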

Whenever we employ one measure as a criterion of probable 
standing in another, we are interested in the regression equation 
as a tool for making our predictions for individuals specific 
rather than general. We employ aptitude measures, particu- 
larly, in this manner: general intelligence tests, special prognosis 
tests, measures of social status, of character traits, etc. We have 
more or less vaguely in mind the element of prediction, too, 
when we are concerned with the reliability of a test, for what 
we have at stake is the question of how nearly individuals may 
be expected to make the same scores upon repetition of the test. 
In connection with prediction of scores the question comes up, 
then, as one of major importance for us: How accurately can we 
predict by the use of a regression equation? This extremely 
important problem we shall discuss next, and we shall develop a 
general formula for the standard error of such estimates. 


STANDARD ERROR OF ESTIMATE 
Whenever a score is predicted for a particular individual by 
means of the regression equation, it is predicted as lying on the 
regression line. But an inspection of Fig. 14 and Table IX 
will show that, in any problem where we do not have perfect
correlation, by no means all of the actual y measures that cor- 
respond to a given x value lie on the regression line; they scatter 
considerably above and below this line. That phenomenon of 
scatter is, perhaps, most obvious when we examine a correlation 
chart where the data are grouped into intervals, as is the case 
in Table IX. As we go out along the X axis, we find a series of 
columns approximately normal in shape with their means at 
successively higher y values, and the regression line passing 
near the center of each column. The y value calculated to 
correspond with a given x value would lie at the point in the 
column where the regression line crosses. The fact that the 
column scatters from this point shows that many of the calculated
$\bar{y}$’s miss the actual values to the extent of the scatter of the
columns. We wish to get a measure of the extent of these
errors in estimating a y score from a known x score by means
of a coefficient of correlation. We shall, therefore, compute a
standard deviation of these “misses.” This is called the standard
error of estimate, and its symbol is $\sigma_{est}$.

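Let $\bar{y}$ be a predicted score and y the score that turns out in
fact to be the one paired with x. Then $(y - \bar{y})$ will be the
“error” in this particular case. Remembering that our x’s
and our y’s are being taken as deviations from their respective
means and that $(y - \bar{y})$ will be in deviation form if the relation is
a rectilinear one, since $\bar{y}$ will then be the mean of the column
in which it occurs, and remembering our value of $\bar{y}$ from the
regression equation,

$$\sigma_{est}^2 = \frac{\Sigma(y - \bar{y})^2}{N} = \frac{\Sigma\left(y - r\frac{\sigma_y}{\sigma_x}x\right)^2}{N} = \frac{\Sigma y^2}{N} - 2r\frac{\sigma_y}{\sigma_x}\frac{\Sigma xy}{N} + r^2\frac{\sigma_y^2}{\sigma_x^2}\frac{\Sigma x^2}{N}$$

In this we have the equivalent of $\sigma_y^2$ and of $\sigma_x^2$. If we multiply
both numerator and denominator of the middle term by $\sigma_y$,
we shall have for the part of the term containing $\Sigma xy$ the formula
for r. Therefore we have

$$\sigma_{est}^2 = \sigma_y^2 - 2r\sigma_y^2\frac{\Sigma xy}{N\sigma_x\sigma_y} + r^2\frac{\sigma_y^2}{\sigma_x^2}\sigma_x^2$$

$$\sigma_{est}^2 = \sigma_y^2 - 2r^2\sigma_y^2 + r^2\sigma_y^2 = \sigma_y^2 - r^2\sigma_y^2 = \sigma_y^2(1 - r^2)$$

$$\sigma_{est} = \sigma_y\sqrt{1 - r^2}$$
(Standard error of estimate) (58)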
A probable error is 0.6745 times as great as a standard deviation
(in a normal distribution, which we assume here). Therefore

$$\text{P.E.} = 0.6745\,\sigma_y\sqrt{1 - r^2}$$


Let us now illustrate the application of this formula. We 
shall employ the same data as used previously in this section 
(page 111). 

For the boy of our illustration a point average of 1.256 was 
predicted on the basis of his intelligence test score, where the r 
was taken to be .40 and the $\sigma_y$ (standard deviation of the point
averages) to be 0.60. How accurate is the prediction?

$$\sigma_{est} = 0.60\sqrt{1 - .40^2} = 0.60\sqrt{1 - .16} = 0.60\sqrt{.84} = 0.55$$

$$\text{P.E.} = 0.6745 \times 0.55 = 0.371$$


This last value means that the chances are 50 in 100 that a 
student’s actual point average will not differ from his predicted 
one by more than 0.371 but that, conversely, they are also the 
other 50 in 100 that the score will be missed by more than that 
amount. The value of $\sigma_{est}$ means that in approximately two-thirds
of the cases we may expect to find our prediction in error
by 0.55 or less, while in the other third our errors may be greater
than that amount. In the case of our particular boy the chances
are 1 to 1 that his point average will not be found to go above
1.627 or below 0.885, while they are 2 to 1 that it will not go above
1.806 or below 0.706.
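The figures of this illustration can be reproduced in a few lines. The Python sketch below is for illustration only; the function name is hypothetical. It computes the standard error of estimate as the standard deviation of the point averages times the square root of one minus r squared, takes 0.6745 of it as the probable error, and lays both off on either side of the predicted point average of 1.256.

```python
import math


def standard_error_of_estimate(sigma_y: float, r: float) -> float:
    """Standard error of estimate: sigma_y * sqrt(1 - r^2)."""
    return sigma_y * math.sqrt(1 - r * r)


se = standard_error_of_estimate(sigma_y=0.60, r=0.40)
pe = 0.6745 * se   # probable error, assuming a normal distribution
predicted = 1.256  # predicted point average from the illustration

print(round(se, 2), round(pe, 3))
print("1 to 1 band:", round(predicted - pe, 3), round(predicted + pe, 3))
print("2 to 1 band:", round(predicted - se, 3), round(predicted + se, 3))
```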

That is really not very accurate predicting. It requires a 
very high coefficient of correlation to enable us to forecast the 
standing of individuals with reasonable accuracy. If we had 
no means of predicting a score at all, but merely drew for individuals
scores at random, the y scores for any given x value would
scatter purely by chance; i.e., they would scatter for any x value
to the same extent to which the scores of the whole test scatter.
To the extent to which a correlation coefficient instead of chance
guides us in predicting scores, to that extent the actual y scores
for a given x value will have a smaller scatter, and with a perfect 
correlation coefficient as a guide they will all lie exactly on the 
regression line; there will be no scatter at all. Where the y scores 
are collected into columns, as they are in Table IX, this relation 
is very obvious. The length of any column shows the extent 
of error in the prediction, and its shortness as compared with the
column of totals at the extreme right indicates the improvement 
we have made over chance by reason of the guidance afforded 
by the correlation coefficient. We may conveniently make a 
ratio out of the scatter of a column and the scatter of the whole 
distribution, and this ratio will show the proportion of chance 
still remaining in our prediction. We shall call this ratio k. 
Then 


$$k = \frac{\sigma_y\sqrt{1 - r^2}}{\sigma_y} = \sqrt{1 - r^2}$$


This ratio $\sqrt{1 - r^2}$ Kelley has named the coefficient of alienation.
It furnishes a very fruitful way of looking at a coefficient of 
correlation and of passing judgment as to how high a correlation 
must be in order to be satisfactory. The student should apply 
this test to coefficients of various sizes. He will find that, 
where r equals .10, there remains 99.5 per cent of guess in a predic- 
tion based on it; the prediction has been improved only one-half 
of 1 per cent over pure chance. Where r is .80, 60 per cent of 
chance still remains; we are only 40 per cent better off than if 
we drew predicted scores out of a hat. Even where r is .95, there
remains 31 per cent of the element of chance in predicting place- 
ment of individuals. It is obvious that for the safe placement of 
individuals very high correlation coefficients are required—much 
higher than those called high by Rugg and others. 
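The test recommended to the student is quickly applied to coefficients of various sizes. The following Python sketch is illustrative only; the function name and the particular list of r’s are our own choices. It takes the coefficient of alienation as the square root of one minus r squared and prints, beside it, the proportionate improvement over chance.

```python
import math


def alienation(r: float) -> float:
    """Coefficient of alienation k = sqrt(1 - r^2)."""
    return math.sqrt(1 - r * r)


for r in (0.10, 0.40, 0.60, 0.80, 0.95, 0.99):
    k = alienation(r)
    print(f"r = {r:.2f}  k = {k:.3f}  improvement over chance = {1 - k:.1%}")
```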

If we are concerned with the prediction of averages for a 
group rather than with scores for individuals, much lower 
correlations may serve our purpose. We shall later see that the
standard error of a mean of a group of y scores predicted on the
basis of a correlation is $\frac{\sigma_y}{\sqrt{N}}\sqrt{1 - r^2}$. Thus the error of
prediction of means of a class of 100 members would be only
one-tenth as great as that involved in the prediction of standings 
of individuals. 
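The reduction to one-tenth for a class of 100 follows at once from the square root of N in the denominator. This Python sketch is illustrative only; the function name is hypothetical, and the values of .60 for the standard deviation and .40 for r are borrowed from the earlier illustration.

```python
import math


def se_of_predicted_mean(sigma_y: float, r: float, n: int) -> float:
    """Standard error of a predicted group mean: (sigma_y / sqrt(N)) * sqrt(1 - r^2)."""
    return sigma_y / math.sqrt(n) * math.sqrt(1 - r * r)


individual = se_of_predicted_mean(sigma_y=0.60, r=0.40, n=1)
class_of_100 = se_of_predicted_mean(sigma_y=0.60, r=0.40, n=100)
print(round(individual, 3), round(class_of_100, 3))
```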

The high residual scatter of y scores for a single x range, as 
shown by the coefficient of alienation, shows that we can place 
a subject only very roughly in a second measure from a knowledge 
of his score on a first measure, unless the coefficient of correlation 
between the two measures is very high. But a further considera- 
tion will show that we can have considerable assurance against 
expectation of extreme shifts even when guided by relatively low
r’s. Let us, therefore, ask what are the probabilities of reaching
certain critical positions in a second array when position in a 
first array and the r between the arrays are known. Let us 
return to the case of the hypothetical student of our above 
discussion who made 82 on the intelligence test and ask what are 
his chances of making honors in college, which requires a 2.5 
average or better. He belongs to a subarray of students for 
whom the predicted mean is 1.256. The standard deviation of
this subarray (column in the correlation chart) is the standard
error of estimate,

$$\sigma_{est} = \sigma_y\sqrt{1 - r^2} = 0.55$$


He aspires to reach a point average of 2.50, which is 1.244 points, 
or 2.26 of the standard deviations of his subclass, above the mean 
of his subclass. Will anybody in his subclass rise as high as that? 
Yes, whatever percentage in a normal distribution lies 2.26$\sigma$’s
above the mean. Reference to the table, page 486, shows that 
0.0119 will do so. Thus he has about 12 chances in 1,000 to 
make honors. But, conversely, there are in the subclass 0.9881, 
or 98.81 per cent, who fall below that critical point; so that there 
are about 988 chances in 1,000 that a student making his intel- 
ligence test score will not reach the honors level. The odds, 
therefore, that he will not make it are 83 to 1 (obtained by 
dividing the chances against by the chances for). Suppose, 
next, we raise the question: What are the odds that he will make 
the minimum for graduation, a point average of 1.00? This is 
below the predicted mean for his subclass by 0.256 points, which 
is 0.46$\sigma$. The probability that he will fall below this point is
0.3228 and the probability of being above it is 0.6772, so that 
his chances of making the minimum grade of graduation are 
roughly two to one. 
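The reference to the normal table may be replaced by a computed area. The Python sketch below is for illustration only: the function names are hypothetical, and the normal area is obtained from the error function rather than from the table cited in the text. Because the distance from the predicted mean is not rounded to two decimals of a standard deviation here, the results differ slightly in the last place from the tabled figures.

```python
import math


def normal_cdf(z: float) -> float:
    """Area of the unit normal curve below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def chance_of_reaching(target, predicted_mean, se_estimate):
    """Proportion of the subarray standing at or above the target score."""
    z = (target - predicted_mean) / se_estimate
    return 1 - normal_cdf(z)


# Figures from the text: predicted mean 1.256, standard error of estimate 0.55.
p_honors = chance_of_reaching(2.50, 1.256, 0.55)
p_graduate = chance_of_reaching(1.00, 1.256, 0.55)

print(round(p_honors, 4), round((1 - p_honors) / p_honors))
print(round(p_graduate, 4), round(p_graduate / (1 - p_graduate), 1))
```

The first line gives the chance of honors and the odds against it; the second gives the chance of reaching the minimum for graduation and the odds in its favor.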

We give, on pages 508 to 510 of the Appendix, a table showing 
the chances in 1,000 of passing from each tenth in a criterion 
array to each tenth in a predicted array for r’s by .10’s from 
.05 to .95. Of course, for an r of 1.00 the prediction would
be perfect. The prediction is made from the mid-point of 
each tenth in the criterion to just across the border in the depend- 
ent array. Thus, the chance of shifting from the 4th tenth in 
the criterion to the 6th tenth in the dependent array means the 
chances in 1,000 that a student who stands on the 45th percentile
in the former array will be found above the 60th percentile in 
the latter array. 


r IN TERMS OF COLUMN VARIANCE 

The reader is asked to think again of Table IX, where the 
data of the correlation table are gathered into columns. These 
columns tend to have the same degree of scatter. Equal vari- 
ability in the columns of a correlation table is called homo- 
scedasticity, and the assumption of homoscedasticity is often 
made in the development of statistical formulas. If we assume 
it here, as we have been doing in this section, we can take the 
standard error of estimate to be the standard deviation of any 
one of these columns. We can then make some interesting 
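algebraic transformations. Let us call the standard deviation of
any column $\sigma_c$. Then

$$\sigma_c = \sigma_y\sqrt{1 - r^2}; \qquad \sigma_c^2 = \sigma_y^2(1 - r^2) = \sigma_y^2 - r^2\sigma_y^2$$

$$r^2\sigma_y^2 = \sigma_y^2 - \sigma_c^2; \qquad r^2 = \frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2} = \frac{\sigma_y^2}{\sigma_y^2} - \frac{\sigma_c^2}{\sigma_y^2}$$

$$r^2 = 1 - \frac{\sigma_c^2}{\sigma_y^2}$$
(Coefficient of determination) (59)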


This makes obvious the relation between the scatter of the 
columns and the coefficient of correlation. Conceivably we 
might compute an r from this formula. However, the fact that
$\sigma_c$ is the standard deviation of the column from the regression
line as origin rather than from the mean of the column makes the
computation of an r in this manner impractical unless perfect
rectilinearity of regression may be assumed. But we shall later
see that another measure of correlation, $\eta$, makes use of just this
procedure, except that there we compute $\sigma_c$ from the means of
the columns.
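The relation between the scatter about the regression line and the size of r can be tried on a small set of figures. The Python sketch below is illustrative only: the eight pairs of scores are hypothetical, and, as an assumption in keeping with homoscedasticity, the scatter about the line is pooled over all x values rather than computed column by column. One minus the ratio of that pooled squared scatter to the variance of the y’s should then equal the square of the product-moment r.

```python
import math

# Hypothetical paired scores, used only to illustrate the relation.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Y = [2.0, 1.0, 4.0, 3.0, 6.0, 8.0, 5.0, 7.0]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n
x = [v - mean_x for v in X]  # deviations from the mean of X
y = [v - mean_y for v in Y]  # deviations from the mean of Y

var_x = sum(a * a for a in x) / n
var_y = sum(b * b for b in y) / n
r = sum(a * b for a, b in zip(x, y)) / (n * math.sqrt(var_x * var_y))

# Squared scatter of the actual y's about the regression line
# (predicted y) = r * (sigma_y / sigma_x) * x, pooled over all x values.
slope = r * math.sqrt(var_y / var_x)
residual_var = sum((b - slope * a) ** 2 for a, b in zip(x, y)) / n

r_squared_from_scatter = 1 - residual_var / var_y
print(round(r * r, 6), round(r_squared_from_scatter, 6))
```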

An equation involved in the above development puts us in
position to prove that an r must always lie between +1.00
and −1.00. You will find above the expression: $\sigma_c^2 = \sigma_y^2(1 - r^2)$.
Now $\sigma_c^2$ (if it differs from zero) must be positive in sign, since any
$\sigma^2$ is made up of squared measures which are always positive if
the measures are real. For the same reason $\sigma_y^2$ on the right-hand
side of the equation must be positive. Therefore the $(1 - r^2)$
must be positive (or zero), since otherwise the product
on the right would have the minus sign and $\sigma_c^2$ would need to be
a negative quantity. But $(1 - r^2)$ can be positive only if r
does not go above +1.00 or below −1.00. Therefore r must
lie between plus 1 and minus 1 inclusive.

We have now shown how a coefficient of correlation can be used 
in the prediction of scores in an array that is not yet in hand but 
with which the correlation of a criterion is known on the basis 
of past experience, and we have seen the limitations of accuracy 
of these predictions in terms of the standard error of estimate 
and of the coefficient of alienation. We have next to discuss 
the other use of an r, viz., to express the degree of community 
between two sets of data. 


DETERMINING AMOUNT OF COMMUNITY BY CORRELATION 

In analyzing the “inductive methods” employed in scientific 
inquiry, John Stuart Mill listed as one of them the method of 
concomitant variation. When the height of the mercury column 
in a thermometer rises as the temperature becomes higher and 
falls as the temperature lowers, a causal relation between these 
two concomitantly varying phenomena is indicated. If certain 
electrical disturbances are found on the earth simultaneously 
with spots on the sun, not only occurring simultaneously with 
the latter or else with a fixed lag but also varying in intensity 
as the latter vary in intensity, a causal connection is likewise 
indicated. Correlation in statistics is similarly merely a mathe- 
matical method of making more specific this matter of con- 
comitance of variation between two sets of variables. It 
is, therefore, one of our methods of attempting to establish 
“laws” and to get at causal relations where the problem of 
isolating single variables is difficult or impossible. For a “law” 
is merely a description of concomitance of behavior between 
two or more factors when the existence and the nature of that 
concomitance have been supposedly infallibly determined; and 
causality is merely another name for such concomitance where 
we believe we know the direction of influence. In the physical 
sciences it is often (though not always) possible to isolate the two 
independent factors, and thus to determine the relation between 
them in a manner that will not vary from sample to sample. 
But in social phenomena we must ordinarily be content to let 
some irrelevant elements drag along mixed up with either or 
both of the variables we are attempting to study, and these
obscure the nature of the relation between our variates under
study and cause the measurable kind and amount of concomitance
to vary somewhat from sample to sample. The correlation
technique is a very powerful device for analyzing such
concomitance.

Since the measured concomitance between the variables will 
differ somewhat from sample to sample because of the presence 
of irrelevant elements which weight our scores, our first concern 
is to know whether there is in fact a real connection between 
our variables. When we study reliability in a subsequent 
chapter, we shall find that, even if there were in the total popula- 
tion a true correlation of zero, we would get 7’s differing from 
zero in samples, some positive and some negative, and that the 
standard deviation of this set of r’s could be estimated from the 
formula . 
mere 0 SS if 

VN-1 VN-1 
Thus, if the sample contained 65 cases, the standard error would 
be 1125. So an r as large as +.125 could be expected to occur 
merely by chance fluctuation from uncorrelated populations 
about one time in seven (1,587 in 10,000), and one of —.125 
equally often; one of +.25, 228 times in 10,000; one of +.375, 
13 times in 10,000; etc. So, if one has obtained from a sample of 
this size an r of +.125, or even of +.25, there is considerable 
risk in asserting positive correlation because the obtained r 
might have arisen merely by chance fluctuation. It is conven- 
tionally said that, in order to give assurance that there is a true 
correlation in the direction indicated by the sign of the one 
obtained in the sample, an obtained r should be at least three 
times as large as the standard error. We shall later show that 
this notion can be easily overworked. What we have is really 
different degrees of probability of an actual connection when the 
ratio of an obtained r to the standard error is certain amounts. 
If the sample is reasonably large and the ratio of the r to its 
standard error is 1, the odds are about 5 to 1 that there is a true 
r between the two sets of variables somewhat above zero in the 
same direction as that of the sample; if the ratio is 2, the odds 
are about 43 to 1; if 3, the odds are about 740 to 1; if 4, about 
32,000 to 1; etc. Moreover, if several successive samplings 
give r’s with the same sign, the probability that there is a true 
correlation with that sign is greatly increased. Even if the r’s 
are prevailingly, though not exclusively, of one sign, an r with 
that sign is indicated with a reliability for the set that is likely 
to be considerably higher than the reliability indicated by the 
samples considered separately. We shall discuss and illustrate 
this point at length in a later chapter (pages 469 to 474). 
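
The figures quoted above can be checked with a short calculation. What follows is a minimal sketch in Python and not part of the original development; the function names are our own, and the odds are reckoned, as the text appears to reckon them, by setting the normal-curve area below the given ratio against the area above it.

```python
import math

def se_r_null(n_cases: int) -> float:
    """Standard error of r when the population correlation is zero: 1/sqrt(N - 1)."""
    return 1.0 / math.sqrt(n_cases - 1)

def upper_tail(z: float) -> float:
    """Proportion of a normal distribution lying more than z sigmas above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def odds_for_true_r(ratio: float) -> float:
    """Odds (to 1) that the true r has the sign of the sample r, given r / sigma_r."""
    p = upper_tail(ratio)
    return (1.0 - p) / p

se = se_r_null(65)  # 65 cases, as in the text
print(f"standard error of r for 65 cases: {se:.3f}")

# Chance of an r at least k standard errors above zero when the true r is zero.
for k in (1, 2, 3):
    print(f"r = +{k * se:.3f}: about {upper_tail(k) * 10000:.0f} times in 10,000")

# Odds of a true correlation in the observed direction for several ratios.
for k in (1, 2, 3, 4):
    print(f"ratio {k}: odds about {odds_for_true_r(k):,.0f} to 1")
```

The computed odds for a ratio of 4 come out slightly under the rounded 32,000 given in the text, which is as close as such round figures can be expected to agree.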

Our first concern, then, in studying the community between 
two sets of variables is to have assurance that there is a real 
correlation between them; and this we determine, as we have 
shown, by finding whether the relation of an r to its σ is 
sufficiently high to guarantee this. Our second concern is to 
find some meaningful way in which to express the amount of this 
community. This we shall do by interpreting r in terms of the 
percentage of overlapping between the two measures. 

A Coefficient of Correlation as Proportion of Overlapping.— 
Suppose that a set of c elemental factors contribute to both scores 
x and y, while there are additional elemental factors, a, that 
contribute to x but not to y, and b factors which contribute to 
y but not to x. We would then have x = a + c and y = b + c. 
Factor c is correlated with both x and y; but the other elements 
are independent of each other and of c. 

If we measure x and y as deviations from their respective 
means, the sums which equal these may be regarded as measured 
from their means and also the constituent addends from their 
respective means. Then 


$$r = \frac{\Sigma xy}{N\sigma_x\sigma_y} = \frac{\Sigma(c+a)(c+b)}{N\sigma_{c+a}\sigma_{c+b}} = \frac{\Sigma c^2 + \Sigma cb + \Sigma ca + \Sigma ab}{N\sigma_{c+a}\sigma_{c+b}}$$


But since c and a, c and b, and a and b are independent of one 
another, and since each is in deviation form so that when summed 
alone each yields zero, the sum of the products in each of the last 
three terms of the numerator would approach zero, and these 
terms would drop out of the equation. We would have left 


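$$r = \frac{\Sigma c^2}{N\sigma_{c+a}\sigma_{c+b}} = \frac{N\sigma_c^2}{N\sigma_{c+a}\sigma_{c+b}} = \frac{\sigma_c^2}{\sigma_{c+a}\sigma_{c+b}}$$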


It can easily be shown that, since the series a and c are uncorrelated, 
$\sigma_{c+a}$ equals $\sqrt{\sigma_c^2 + \sigma_a^2}$, as follows: 

$$\sigma_c^2 = \frac{\Sigma c^2}{N} \quad\text{and}\quad \sigma_a^2 = \frac{\Sigma a^2}{N}, \quad\text{whence}\quad \sigma_c^2 + \sigma_a^2 = \frac{\Sigma c^2 + \Sigma a^2}{N}$$

Since a and c are uncorrelated and are in the form of deviations 
from their respective means, $2\Sigma ac$ will equal zero. Hence we 
can insert that value in the numerator of our fraction without 
changing its value. 

$$\sigma_c^2 + \sigma_a^2 = \frac{\Sigma c^2 + 2\Sigma ac + \Sigma a^2}{N} = \frac{\Sigma(c+a)^2}{N} = \sigma_{c+a}^2$$

Therefore $\sigma_{c+a} = \sqrt{\sigma_c^2 + \sigma_a^2}$; and similarly $\sigma_{c+b} = \sqrt{\sigma_c^2 + \sigma_b^2}$. 


Making this substitution in our equation above, 


$$r = \frac{\sigma_c^2}{\sqrt{\sigma_c^2 + \sigma_a^2}\,\sqrt{\sigma_c^2 + \sigma_b^2}}$$

Let us now assume that $\sigma_a^2$ equals $\sigma_b^2$. That is to assume 
that there are as potent factors accompanying the x variable 
that contribute to the total variance but not to the correlation 
as there are accompanying the y variable. This is not a violent 
assumption, but, even if it does not hold strictly true, that fact 
would not appreciably vitiate the conclusion we are about to 
draw. Making this assumption, the two terms under the 
radical sign become alike, and their product gives us one of them 
with the radical sign removed. Making this adjustment and 
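then availing ourselves of the converse of the showing made above 
about the relation of $\sigma_{c+a}^2$ to $\sigma_c^2 + \sigma_a^2$, we have 

$$r = \frac{\sigma_c^2}{\sigma_c^2 + \sigma_a^2} = \frac{\sigma_c^2}{\sigma_{c+a}^2} \tag{60a}$$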


Let us call $\sigma_c^2$ the variance due to the common factor and 
$\sigma_{c+a}^2$ the total variance. Our results, then, mean that the 
coefficient of correlation between two arrays is that proportion 
of the total variance which is due to the common factor present in 
each test.¹ 
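
The result can be tried on artificial data. The following Python sketch is our own illustration, not the author's procedure: it builds x = a + c and y = b + c from independent normally distributed components (the normal form, the standard deviations, the number of cases, and the seed are all assumptions chosen for the example) and compares the obtained r with the value the derivation predicts.

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment r computed from deviations about the two means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def predicted_r(sd_c, sd_a, sd_b):
    """sigma_c^2 / (sigma_{c+a} * sigma_{c+b}) for mutually independent a, b, c."""
    var_c = sd_c ** 2
    return var_c / math.sqrt((var_c + sd_a ** 2) * (var_c + sd_b ** 2))

def simulated_r(sd_c, sd_a, sd_b, n_cases, seed=0):
    """Draw x = a + c and y = b + c and return the obtained r."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_cases):
        c = rng.gauss(0.0, sd_c)               # factor common to x and y
        xs.append(c + rng.gauss(0.0, sd_a))    # a contributes to x only
        ys.append(c + rng.gauss(0.0, sd_b))    # b contributes to y only
    return pearson_r(xs, ys)

r_pred = predicted_r(1.0, 1.0, 1.0)            # equal variances: half is common
r_sim = simulated_r(1.0, 1.0, 1.0, n_cases=20000)
print(f"predicted r = {r_pred:.3f}, obtained r = {r_sim:.3f}")
```

With equal variances for a, b, and c, half of the total variance of each score is common, and the obtained r should fall close to .50, differing from it only by sampling fluctuation.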


1 Compare Kelley’s development, Interpretation of Educational Measurements, 
pp. 193-195. In our development we took no cognizance of the 
possibility of imperfect measurements of x or y, which if present would 
vitiate some of our assumptions. Hence the finding holds strictly for r 
only when x and y are perfectly measured, i.e., for the “true” r, the r 
“corrected” for “attenuation.” 

We may put this into more meaningful form if we make some 
fairly well-warranted assumptions about the nature of our 
a, b, and c factors. Let us suppose that the c factors in any one 
item (score) are elemental units equal to one another in potency 
and any one of them equally likely in a given item to be present 
or absent. Let us make similar assumptions about b and a. 
These assumptions square readily with the behavior of “determiners” 
in controlling traits and with the Mendelian laws of 
heredity. The number of c factors will then vary from item to 
item in such manner that they will make a normal distribution 
with the mode at half the maximum number. We shall learn 
in Chap. X that the standard deviation of a point binomial 
is $\sqrt{pqn}$. In this case both p and q are 0.50 and the n is the 
aggregate number of c factors in all the scores combined, which we 
shall call $n_c$. Therefore 


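$$\sigma_c = \sqrt{0.50 \cdot 0.50 \cdot n_c} = 0.50\sqrt{n_c},$$

and similarly $\sigma_a = 0.50\sqrt{n_a}$ and $\sigma_b = 0.50\sqrt{n_b}$. By utilizing 
the principle, developed above, that $\sigma_{a+c} = \sqrt{\sigma_a^2 + \sigma_c^2}$, we would 
have likewise $\sigma_{c+a} = 0.50\sqrt{n_c + n_a}$ and $\sigma_{c+b} = 0.50\sqrt{n_c + n_b}$. 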


Substituting these values in our formula (60) for r obtained above, 


$$r = \frac{0.25\,n_c}{0.50\sqrt{n_c + n_a}\cdot 0.50\sqrt{n_c + n_b}} = \frac{n_c}{\sqrt{(n_c + n_a)(n_c + n_b)}} \tag{61}$$

If now we assume that $n_a$ equals $n_b$, which is similar to the 
assumption we made in our development above, our equation 
would become 

$$r = \frac{n_c}{n_c + n_a} \tag{62}$$


Put in words this means that, if there is as much of the measured 
factor x that is not y as there is of measured y that is not x, 
the coefficient of correlation between x and y expresses the 
percentage of overlapping between the two universes. 

Suppose now that b equals zero; i.e., suppose that all of y 
is included within x but not all of x is included in y. We would 
then have 

$$r = \frac{n_c}{\sqrt{(n_c + n_a)\,n_c}} = \sqrt{\frac{n_c}{n_c + n_a}}, \qquad r^2 = \frac{n_c}{n_c + n_a} \tag{63}$$

That is, if all of y is included in x but not all of x is included 
in y, the percentage of overlapping is equal to the square of the 
coefficient of correlation between x and y. We would have the 
former condition fulfilled if some factors in measured intelligence 
contributed toward attainment in scholarship while some others 
contributed toward leadership, toward social graces, etc., but 
not toward academic scholarship; if scholarship, conversely, 
were due in part to the intelligence factor that the tests can 
measure but also in part to the social status of the home, to 
health, and to accidents of morale; and if the collateral factors 
in intelligence were equal in number to the collateral ones in 
scholarship. The r would then be the percentage of overlapping 
between measured intelligence and measured scholarship. We 
would have the latter condition fulfilled if all of study hours 
contributed toward scholarship but scholarship were due not only 
to study but to some other factors in addition. Here the square 
of the coefficient of correlation between study hours and scholar- 
ship would give the percentage of overlapping. 
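
The two conditions just described can be set side by side numerically. This is a small Python sketch of formulas (61) to (63); the factor counts (30 common factors, 20 collateral ones) are hypothetical numbers chosen only to make the arithmetic easy, and the function name is our own.

```python
import math

def r_from_factor_counts(n_c: int, n_a: int, n_b: int) -> float:
    """Formula (61): r = n_c / sqrt((n_c + n_a)(n_c + n_b))."""
    return n_c / math.sqrt((n_c + n_a) * (n_c + n_b))

n_c, n_a = 30, 20                     # hypothetical counts of elemental factors
overlap = n_c / (n_c + n_a)           # share of x made up of the common factors

# Former condition: as many collateral factors in y as in x (n_b = n_a).
r_equal = r_from_factor_counts(n_c, n_a, n_a)

# Latter condition: all of y is included in x (n_b = 0).
r_nested = r_from_factor_counts(n_c, n_a, 0)

print(f"overlap = {overlap:.2f}")
print(f"equal collateral factors: r = {r_equal:.2f}")
print(f"y wholly within x: r = {r_nested:.3f}, r squared = {r_nested ** 2:.2f}")
```

In the first case r itself equals the overlap of .60; in the second it is the square of r that equals .60, while r itself is larger.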

Since the “true” r involved in our formulas is always a little 
greater than the r obtained from fallible measurements, while 
the square of the true r is likely to be a little less than the obtained 
one, and since the conditions obtaining in life are usually some- 
where between those of our two assumptions, the coefficient of 
correlation may be regarded as fairly descriptive of the percentage 
of overlapping between the two universes correlated. The size 
of the correlation shows us the extent to which the factor x is 
adequate to account for the factor y (the percentage of the 
behavior of y that is attributable to x), or the reverse. 


Exercises 


1. From the data of Table X, page 124, determine the relation between 
size of school for defectives and the economy with which such school can be 
run, where economy is measured in terms of cost per pupil. 

2. If you have used the Pearson product-moment formula in Exercise 1, 
turn the scores now into ranks and compute ρ. Compare it with r. 

3. Compute the standard error of estimate for the r of Exercise 1, and con- 
cretely interpret its meaning. 

4. Compute r’s between one or more pairs of columns in Table IV, pages 
58 to 61. 

5. Recompute one or more of the r’s of Exercise 4 using only five intervals 
in each array. Compare with the r’s from the individual pairs of scores, or 
from 15 or more intervals. Make Sheppard’s correction in the σ’s of your 
r formula as applied to the five-category problem, and then compare your 
r with the one from narrow categories (see page 397). 

6. From a sample of 30 or 40 pairs, compute an r by the sums and by the 
differences methods and compare the convenience of these methods with 
that of the Pearson product-moment method. 


TABLE X. PER CAPITA COSTS AND ENROLLMENT IN 45 PUBLIC RESIDENTIAL 
SCHOOLS FOR DEAF CHILDREN IN THE UNITED STATES¹ 

Enrollment   Per capita cost   Name of school
       142         $1,147.10   Clarke School for the Deaf, Massachusetts
       102            878.35   Pennsylvania State Oral School for the Deaf
       360            847.70   New Jersey School for the Deaf
       207            823.67   Columbia Institution for the Deaf
       111            767.33   North Dakota School for the Deaf
       122            754.49   Mystic Oral School, Connecticut
       360            725.13   New York Institution for the Deaf and Dumb
       536            695.21   Pennsylvania Institution for the Deaf
       303            656.92   Western Pennsylvania School for the Deaf
       361            644.95   Iowa School for the Deaf
        96            644.86   Rhode Island School for the Deaf
       104            620.29   Northern New York Institution for Deaf-mutes
        81            619.96   Beverly School for the Deaf, Massachusetts
       254            617.32   Institution for the Improved Instruction of Deaf-Mutes, N.
       116            611.77   Central New York School for the Deaf
       267            603.06   California School for the Deaf
       307            590.59   Missouri School for the Deaf
       239            585.77   Florida School for the Deaf and the Blind
       220            570.28   Rochester School for the Deaf, New York
       223            560.54   American School for the Deaf, Connecticut
       220            539.52   Wisconsin School for the Deaf
       596            521.29   Illinois School for the Deaf
       221            518.46   Kansas State School for the Deaf
       175            514.61   Maryland State School for the Deaf
       410            512.45   St. Joseph's Institute for Deaf Mutes, New York
        11            511.74   South Dakota School for the Deaf
       226            470.37   Le Couteulx St. Mary’s Institution, New York
       324            461.24   Minnesota School for the Deaf
       204            450.98   Nebraska School for the Deaf
       152            441.49   Washington State School for the Deaf
       510            408.25   Texas School for the Deaf
       214            398.41   Louisiana State School for the Deaf
       443            380.00   Indiana State School for the Deaf
       393            382.72   Oklahoma School for the Deaf
       337            379.82   Kentucky School for the Deaf
       477            365.65   Michigan School for the Deaf
       120            360.22   Oregon State School for the Deaf
       524            357.87   Ohio State School for Deaf
       328            356.58   Arkansas School for the Deaf
       122            354.62   Maine School for the Deaf
       334            334.07   Alabama Institute for the Deaf and Blind
       334            308.16   Tennessee School for the Deaf
       261            305.69   Georgia School for the Deaf
       259            299.23   Mississippi School for the Deaf
                               North Carolina School for the Deaf

1 After S. G. Crayton, Bull., Univ. Ky. Bur. School Service, Vol. 7, No. 1, pp. 122-123. 


CHAPTER V 
RELIABILITY OF STATISTICS 


STANDARD ERROR OF A MEAN 


Variability in Means.—When we take the mean of a group, we 
are customarily taking the mean of a sample out of a larger 
population. We may, for example, get questionnaire returns 
from 100 individuals declaring their several incomes and compute 
from these the mean income for the group. This we would 
characteristically wish to take as evidence of the average income 
of the whole population from which we drew the sample. Simi- 
larly we might wish to get some evidence of the extent of general 
information possessed by the high-school pupils of our city 
and might content ourselves with administering a test of general 
information to several hundred of these, on the faith that what 
we learned about these hundreds would be fairly representative 
of the whole city. Even if we test all the individuals of a given 
set, our findings still constitute essentially a sampling, since to 
get a complete picture of the situation, we would need to retest 
the group for all sorts of possible changes of conditions. 

As we draw other samples from our population, the obtained 
means are likely not to be precisely the same as the first one. 
From our second 100 respondents regarding income, and from 
the third, and the fourth, etc., our means would fluctuate some- 
what. The same thing is true wherever we employ samples, 
even where we retest the same group. It may be of much 
importance to know how great shifts to expect in further samples, 
since such knowledge would permit us to know with how great 
confidence to accept the mean we have in hand. One way 
would be to draw very many samples and to compute their 
actual means in order to learn empirically how stable these means 
are. But ordinarily that is not feasible; it is too expensive in 
time and money. But fortunately statistical principles permit 
us to infer the extent of this fluctuation theoretically from data 
furnished by our single sample. Such theoretically inferred 
standard deviation of means from possible further samples is 
called the standard error of the mean to distinguish it from a 
standard deviation that has been empirically computed. In 
this section we shall develop the formula for this measure of the 
reliability of a mean. 

Development of the Formula for the Standard Error of a Mean. 
Let us conceive our measures as deviations from the mean of all 
the means of a great many random samples, say S samples. 
Then for the value of the deviation of the mean of any one 
sample from the mean of all the means we would have 


$$M_1 = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$

where the x’s are the individual measures and n is the number 
of them in the set. Squaring for this mean, 


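$$M_1^2 = \frac{(x_1 + x_2 + x_3 + \cdots + x_n)^2}{n^2} = \frac{x_1^2 + x_2^2 + \cdots + x_n^2 + 2x_1x_2 + 2x_1x_3 + \cdots + 2x_{n-1}x_n}{n^2}$$

We may write this 

$$M_1^2 = \frac{\sum_{1}^{n} x^2 + 2\sum_{i=1}^{n}\sum_{j>i} x_i x_j}{n^2},$$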


where the symbolism in the cross-products term means that each 
item, as $x_1$, is combined with every other than itself in the sample 
and that each of the items, $x_1, x_2, \ldots, x_i, \ldots, x_n$ is similarly 
thus combined with the others, all of them summed together 
constituting the tail of the expression. We shall have similar 
expressions for $M_2$, $M_3$, etc., though with what we must take to 
be different x’s. The standard-error-squared of the means is 
the sum of all the squared means divided by the number of means, 
which we have agreed shall be S. Summing all these sets of 
values together, we have the formula 

$$\frac{\Sigma M^2}{S} = \frac{\sum_{1}^{S}\sum_{1}^{n} x^2 + 2\sum_{1}^{S}\sum_{i=1}^{n}\sum_{j>i} x_i x_j}{S n^2}$$

where now the symbolism in the last term indicates that the 
cross products are kept segregated by samples in summing, but 
the values within the several samples are then summed for the 
whole set of S samples. 

In the conventional proof, followed by Kelley, Jones, and 
others, it is claimed that the tail of this expression amounts 
substantially to zero, the following theorem being cited as proof: 
The sum of products of measures which are independent of each 
other and whose means are zero, equals zero. But the proof is 
invalid because the theorem upon which it rests is inapplicable. 
For, on the one hand, the means of the measures within the 
several sets in which the products are obtained are not zero 
but $M_1$, $M_2$, $M_3$, etc.; and, on the other hand, the products are not 
inclusive of all, since those of the type $x_1x_1$ are definitely withheld 
and included in the $\Sigma x^2$’s. We shall resort to a more roundabout, 
but mathematically defensible, development to prove 
that the tail approaches zero as a value only under certain 
conditions. Meanwhile, noticing that $\Sigma M^2/S$ is $\sigma_M^2$ and clearing 
of fractions, we shall write our formula more simply as follows: 

$$S n^2 \sigma_M^2 = \sum_{1}^{S}\sum_{1}^{n} x^2 + 2\sum_{1}^{S}\sum_{i=1}^{n}\sum_{j>i} x_i x_j \tag{A}$$


We have said that S should represent many samples. In 
order to exhaust the situation and thus perfect our development, 
we shall make S all the possible different samples that can be 
drawn from a total population of N taken n at a time. These 
samples must always be different, but the slightest possible 
difference will do: merely the change of a single x in the whole 
set of n x’s. Reference to the treatment of the mathematics 
of choice in a textbook in algebra will show that the number of 
combinations of N things taken n at a time (consequently the 
numerical value of S) is given by the formula 


$$S = \frac{N(N-1)(N-2)(N-3)(N-4)\cdots(N-n+1)}{n(n-1)(n-2)(n-3)(n-4)\cdots 1} \tag{B}$$


1 KELLEY, T. L., Statistical Method, p. 84. 

In the set of S samples there will be, as implied above, duplication 
of variates. How many duplications? Consider first 
the part of the expression containing $\Sigma\Sigma x^2$. All the $x^2$’s will, 
of course, appear in the summation as a whole, but not every one 
will appear in each sample. It will, however, appear as often 
as combinations can be made of the other $x^2$’s taken so as to 
leave room for it; i.e., the number of times it will occur is the total 
possible number of combinations that can be made of N - 1 
things taken n - 1 at a time. In all of our future manipulations 
in this chapter we shall use formula (B) as the basic formula. 
To learn how many combinations can be made of N - 1 things 
taken n - 1 at a time we need only substitute N - 1 for N and 
n - 1 for n. Doing this we shall have, as the frequency with 
which each of the $x^2$’s will occur, 


$$\frac{(N-1)(N-2)(N-3)(N-4)\cdots(N-n+1)}{(n-1)(n-2)(n-3)(n-4)\cdots 1}$$

Since each of the $x^2$’s will occur this same number of times, we 
may use it as a coefficient for the summation of all the $x^2$’s. 
Thus the first quantity in the right-hand member of our equation 
will have the value 

$$\frac{(N-1)(N-2)(N-3)(N-4)\cdots(N-n+1)}{(n-1)(n-2)(n-3)(n-4)\cdots 1}\sum_{1}^{N} x^2$$

where $\sum_{1}^{N} x^2$ is the sum of all the different $x^2$’s in the whole 
N population. 

We shall next deal with the treble summation constituting 
the second part of the expression in Eq. (A). In order to be 
able to substitute a known value for it later, we need to ascertain 
how many times each combination of elements, $x_ix_j$, will recur 
in it. Within each sample there will be no duplication of paired 
terms, but successive samples will partly overlap and partly 
differ. So there will be duplications of a given paired element, 
just as in the case of the $x^2$’s, but perhaps a different number. 
Let us see how many. 

Within each sample the number of different paired elements 
is the number of possible combinations of n things taken two 
at a time. If, in our basic formula (B), you will substitute 2 for 
n and n for N, you will find this number to be n(n - 1)/2. The 
number of samples has already been given in formula (B). Therefore, 
the total number of paired items in the whole of the tail for 
all the samples combined is the product of the number of samples 
and the number of paired items in each sample, viz., 

$$\frac{N(N-1)(N-2)(N-3)(N-4)\cdots(N-n+1)\,n(n-1)}{2\,n(n-1)(n-2)(n-3)\cdots 1}$$


But the whole number of different paired items is the number 
that can be made from the whole population of N taken two at 
a time. This, as appropriate substitution in formula (B) will 
show, is N(N —1)/2. Since all the possible variates occur 
and with equal frequencies, the number of times each will occur 
is the total number divided by the number of different ones. 
That is, 
$$\frac{2N(N-1)(N-2)(N-3)\cdots(N-n+1)\,n(n-1)}{2N(N-1)\,n(n-1)(n-2)(n-3)\cdots 1}$$

Certain of these terms cancel out, leaving as the frequency 
of occurrence of each possible different pair, 

$$\frac{(N-2)(N-3)\cdots(N-n+1)}{(n-2)(n-3)\cdots 1} \tag{C}$$
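
The counting argument can be verified by actually listing every sample from a small population. The sketch below is our own illustration in Python; the population of N = 6 items sampled n = 3 at a time is an arbitrary small case. It tallies how often each item and each pair turns up and compares the tallies with formula (B) and with the two frequencies derived from it.

```python
from collections import Counter
from itertools import combinations
from math import comb

N, n = 6, 3                                   # an arbitrary small case
samples = list(combinations(range(N), n))     # every possible different sample

# How often each single item, and each pair of items, occurs over all samples.
item_counts = Counter(item for s in samples for item in s)
pair_counts = Counter(pair for s in samples for pair in combinations(s, 2))

print("number of samples S:", len(samples), "formula (B):", comb(N, n))
print("occurrences of each item:", set(item_counts.values()),
      "combinations of N-1 things n-1 at a time:", comb(N - 1, n - 1))
print("occurrences of each pair:", set(pair_counts.values()),
      "formula (C):", comb(N - 2, n - 2))
```

Here `comb(N - 2, n - 2)` is formula (C) written as a number of combinations, which is what the quotient of the two products amounts to.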


Abandoning that line of development for the moment, we 
may write 


$$(x_1 + x_2 + x_3 + \cdots + x_N)(x_1 + x_2 + x_3 + \cdots + x_N) = 0$$

for each quantity in parentheses sums all of our N items, and 
they aggregate zero because the measures were taken as deviations 
from the mean of the whole set. Multiplying out, 

$$x_1^2 + x_2^2 + \cdots + x_N^2 + 2x_1x_2 + 2x_1x_3 + \cdots + 2x_{N-1}x_N = 0$$

This we may write more briefly as $\sum_{1}^{N} x^2 + 2\sum_{i=1}^{N}\sum_{j>i} x_i x_j = 0$. 

Therefore, transposing, $2\sum_{i=1}^{N}\sum_{j>i} x_i x_j = -\sum_{1}^{N} x^2$. This double 
summation involves the value for the sum of all possible products 
of different variates taken two at a time in a population of N 
in terms of $\sum_{1}^{N} x^2$. Formula (C) gave the number of times 
such systems of paired products recur in the treble summation 
of formula (A). Therefore the value of the second part in formula 
(A) is the product of $-\sum_{1}^{N} x^2$ and the coefficient indicated 
in formula (C). So, substituting in formula (A) the two coefficients 
thus determined, we have 


$$S n^2 \sigma_M^2 = \frac{(N-1)(N-2)\cdots(N-n+1)}{(n-1)(n-2)\cdots 1}\sum_{1}^{N} x^2 - \frac{(N-2)(N-3)\cdots(N-n+1)}{(n-2)(n-3)\cdots 1}\sum_{1}^{N} x^2$$


Let us examine this expression closely. If we multiply the 
first of the members on the right by N/n, we shall have the 
equivalent of S, for we shall have the formula for the number of 
combinations that can be formed of N things taken n at a time. 
Similarly we shall have S as the coefficient of the second term 
if we multiply by $N(N-1)/[n(n-1)]$. We can do such multiplying if 
we indicate a compensating division or (which amounts to the 
same thing) a multiplication by the reciprocals of these terms. 
Making these adjustments, we have 

$$S n^2 \sigma_M^2 = \frac{n}{N}\,S\sum_{1}^{N} x^2 - \frac{n(n-1)}{N(N-1)}\,S\sum_{1}^{N} x^2$$


Dividing through by nS and taking $\sum x^2/N$ out of the parentheses, 
we have 

$$n\sigma_M^2 = \frac{\sum_{1}^{N} x^2}{N}\left(1 - \frac{n-1}{N-1}\right)$$


But N is the number of items out of which $\sum_{1}^{N} x^2$ is constituted, 
since this has been so carried as not to include the duplicates. 
Therefore, $\sum_{1}^{N} x^2/N$ is the $\tilde\sigma^2$ of the whole N population. Making 
this substitution and again dividing by n: 

$$\sigma_M^2 = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n-1}{N-1}\right) \tag{D}$$

Now let N increase infinitely. Then the value of the fraction 
in the parentheses will approach zero in value and, in the 
limit, 

$$\sigma_M^2 = \frac{\tilde\sigma^2}{n} \quad\text{and}\quad \sigma_M = \frac{\tilde\sigma}{\sqrt{n}} \qquad \text{(Standard error of a mean)} \tag{64}$$
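
Formula (D), from which (64) follows in the limit, lends itself to a direct check: for a small population one can draw every possible sample, compute every mean, and compare the variability of those means with what the formula gives. The Python sketch below is our own illustration; the six population values are arbitrary, and the function names are hypothetical.

```python
from itertools import combinations

def variance_of_all_sample_means(population, n):
    """Mean squared deviation of every possible sample mean from the population mean."""
    mu = sum(population) / len(population)
    squared = [(sum(s) / n - mu) ** 2 for s in combinations(population, n)]
    return sum(squared) / len(squared)

def formula_d(population, n):
    """Formula (D): (population variance / n) * (1 - (n - 1)/(N - 1))."""
    big_n = len(population)
    mu = sum(population) / big_n
    var_pop = sum((x - mu) ** 2 for x in population) / big_n
    return var_pop / n * (1 - (n - 1) / (big_n - 1))

population = [2.0, 3.0, 5.0, 8.0, 13.0, 21.0]   # arbitrary illustrative values

for n in range(1, len(population) + 1):
    empirical = variance_of_all_sample_means(population, n)
    theoretical = formula_d(population, n)
    print(f"n = {n}: from all samples {empirical:.4f}, from (D) {theoretical:.4f}")
```

When n equals N there is only one possible sample, its mean is the population mean, and both columns fall to zero, which is the extreme case of the restriction treated later in this chapter.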


Notice that formula (64) has in its numerator $\tilde\sigma$, the standard 
deviation of the population. Of course, we could never know $\tilde\sigma$ 
and would need either to substitute for it s, an estimate of the 
population value, or merely $\sigma$, the standard deviation of the 
sample. On page 70 we showed that 

$$\tilde\sigma_x = \sigma_x\sqrt{\frac{n}{n-1}}$$

Making that substitution, we have 

$$\sigma_M = \frac{\tilde\sigma_x}{\sqrt{n}} = \frac{\sigma_x}{\sqrt{n}}\sqrt{\frac{n}{n-1}} = \frac{\sigma_x}{\sqrt{n-1}} \qquad \text{(Standard error of a mean)} \tag{64a}$$

This (n — 1) instead of n always belongs theoretically to a 
standard error of a mean. However, in educational statistics 
we customarily neglect the distinction because our n’s are so 
large that the subtraction of a 1 does not make an appreciable 
difference. While it is not worth the student’s trouble to make 
the correction in most statistical practice, he should remember 
that it always theoretically belongs in his formula and should 
employ it whenever, in sufficiently trustworthy measures, his n 
becomes small enough that the correction would make an appreciable 
difference.¹ 

Effect of Restricted Selection—We wish now to direct the 
attention of the reader to the (n — 1)/(N — 1) of formula (D). 
For the formula to hold in the simple way in which we left it at 
the close of our last paragraph above, the N must be very large. 
That is not always the case. Suppose, for example, you were 
taking samples of 25 pupils each from a total set of 50 pupils. 
Here n would be half as large as N. Disregarding the 1 sub- 
tracted from each, on the ground that it makes little difference 
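with numbers of reasonable size, 

$$\sigma_M = \frac{\tilde\sigma}{\sqrt{n}}\sqrt{1 - \frac{25}{50}} = 0.707\,\frac{\tilde\sigma}{\sqrt{n}}$$
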
1 We treat the reliability of small samples on pp. 171-176. 

It is obvious that the limitation of the sampling here has a 
marked influence in decreasing the size of the standard error of 
the mean. In general, if we let p represent the percentage that 
the sample is of the whole population from which the samples 
are drawn, 

$$\sigma_M = \frac{\tilde\sigma}{\sqrt{n}}\sqrt{1 - p} \tag{E}$$

Ordinarily the research worker will not have occasion to use this 
formula as here presented, but it is interesting and important as 
generalizing a principle we shall treat in our next paragraph. 
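
For the case of 25 pupils drawn from 50, the effect of the restriction can be put in figures. The following Python sketch is ours; the population standard deviation of 10 is an assumed value used only for illustration, and the two function names are hypothetical. It compares formula (E) with the exact factor of formula (D), in which the 1 is not disregarded.

```python
import math

def se_mean_restricted_exact(sd_pop, n, big_n):
    """Formula (D) in square-root form: (sd/sqrt(n)) * sqrt(1 - (n - 1)/(N - 1))."""
    return sd_pop / math.sqrt(n) * math.sqrt(1 - (n - 1) / (big_n - 1))

def se_mean_restricted_approx(sd_pop, n, p):
    """Formula (E): (sd/sqrt(n)) * sqrt(1 - p), with p the sample's share of the population."""
    return sd_pop / math.sqrt(n) * math.sqrt(1 - p)

sd_pop = 10.0      # assumed population standard deviation, for illustration only
n, big_n = 25, 50  # samples of 25 from a total set of 50, as in the text

unrestricted = sd_pop / math.sqrt(n)
exact = se_mean_restricted_exact(sd_pop, n, big_n)
approx = se_mean_restricted_approx(sd_pop, n, n / big_n)

print(f"unrestricted (64): {unrestricted:.3f}")
print(f"restricted, exact (D): {exact:.3f}")
print(f"restricted, approximate (E): {approx:.3f}")
```

The exact and the approximate values differ only in the second decimal place, which bears out the remark that the 1 subtracted from each number makes little difference with numbers of reasonable size.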

Standard Error of a Mean in Correlated Series.—In our 
previous section we saw that restriction of the population from 
which samples are drawn operates to reduce the standard error 
of the mean, making it $\sqrt{1 - p}$ times as great as it would be if 
the samples came from an unrestricted population. That is 
because the successive samples overlap one another to the extent 
to which they are crowded into a small total population. We 
have a special case of such restriction when the successive 
samples are matched with an initial one in a relation that involves 
correlation. We have seen (page 120) that correlation depends 
upon overlapping of the correlated samples, but not necessarily 
because of narrow boundaries of the total population from which 
samples are drawn. Restriction due to correlation would 
happen, for example, when a class was retested with the same 
test or with a different form of the same test. It would happen 
equally certainly if a number of groups matched with an initial 
group for ability (say, on intelligence scores) were tested with 
the same test. In both cases the successive samples would 
fluctuate less, and hence have a smaller standard error, than if 
they had not been matched with an array with which they were 
correlated. We shall undertake to develop a formula for the 
standard error of a mean under this condition of correlation. 

Suppose we have a series of x scores and another series of 
y scores, the two sets being correlated. The x scores corresponding 
to y scores of a given size would scatter so that their standard 
deviation would be $\sigma_x\sqrt{1 - r_{xy}^2}$ (see page 113). The x’s and the 
y’s may be any sort of units, including means. So the x means 
corresponding to y means of a given size would scatter in such 
way that the measures of their variability would be, if $\sigma_{M_x}$ is 
the standard deviation of the means of a random selection of 
samples and $\sigma_{M_{x_i}}$ that of a column of means of samples belonging 
to a particular level of ability as measured by the matching 
test, 

$$\sigma_{M_{x_i}} = \sigma_{M_x}\sqrt{1 - r_{M_xM_y}^2}$$


But we have already shown in this chapter that $\sigma_{M_x} = \tilde\sigma_x/\sqrt{n}$, 
and we shall shortly show (page 162) that $r_{M_xM_y} = r_{xy}$. Substituting 
these values, we have (since the i subscript has been 
employed merely to indicate any particular level of ability at 
which we are matching so that we may now drop it from our 
notation) 

$$\sigma_M = \frac{\tilde\sigma}{\sqrt{n}}\sqrt{1 - r^2} \qquad \text{(Standard error of a mean, matching on a fallible criterion)} \tag{65}$$

The r here is the coefficient of correlation between the matching 
element and the successive samples. In order to involve this 
principle, it is only necessary that the groups be matched for 
equality of means, since that will force correlation of individuals. 
However, the r could not be computed unless individuals were 
paired as well as means, though it might be known from previous 
experience with the measures. 

We have treated the case where the matching is on a fallible 
criterion—the criterion measures having a certain unreliability 
which results in our accepting certain sample groups as matched 
with the others when, truly measured, they would not belong. 
We shall later see (page 207) that the scatter of correlated 
measures about the true scores with which they should be paired 
rather than about the fallible ones with which they appear to 
be paired is measured by $\sigma_x\sqrt{1 - r}$. The same is true when 
our correlated measures are means. Our formula, therefore, 
where the matching is on a “true” criterion, would be 

$$\sigma_M = \frac{\tilde\sigma}{\sqrt{n}}\sqrt{1 - r} \qquad \text{(Standard error of a mean, matching on an infallible criterion)} \tag{66}$$

We would have such matching on a true criterion where 
the same group was to be retested, for here the paired individuals 
are the same persons, consequently truly paired as to ability. 
The variability of the means to be expected if we should repeatedly 
retest the same group is, probably, what we usually have in 
mind when we think of the standard error of a mean; hence 
formula (66) is the one most often to be used. The r here is 
the reliability coefficient of the test. If we have in mind the 
variability to be expected in case we should sample successive 
groups of the same mental age or of the same social status, we 
should use formula (65). Here the r would be the coefficient of 
correlation between our test and mental age or social status, 
either as determined by calculation in this case or as known from 
previous experience with the measures. If we have in mind 
the fluctuation of the means of random samples from our popula- 
tion regardless of matching for equality in any factor, we should 
employ formula (64). One cannot speak with precision about 
the standard error of a mean unless he indicates whether he refers 
to a random sampling, to a sampling matched on a fallible 
criterion (as when one measures, say, the voluntary reading of 
pupils of the same average educational age), or to repeated 
testing of the same group (which involves matching on an 
infallible criterion). The use of the correct rather than an 
incorrect formula may make a vast difference. The standard 
deviation of the total score on the Stanford Achievement Test 
in the fourth grade is given by Kelley¹ as 10 points and the 
mean as 32.7. A reliability coefficient of .89 is claimed in the 
test manual for this grade. By the conventional formula (64a) 
the standard error of the mean would be, for a group of 37 pupils, 
$10/\sqrt{36} = 1.7$. By the correct formula it would be 

$$\frac{10}{\sqrt{36}}\sqrt{1 - .89} = 0.55.$$

This latter is just about a third of the former. 
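
The three formulas may be placed side by side in a few lines of Python. This is a sketch of our own, with hypothetical function names; following the worked example above, each uses the sample standard deviation with n - 1 under the radical. The figures for the Stanford Achievement Test are those of the text, while the r of .60 used for the fallible-criterion case is an assumed value, since the text gives none.

```python
import math

def se_mean_random(sd_sample, n):
    """Formula (64a): random sampling, sigma / sqrt(n - 1)."""
    return sd_sample / math.sqrt(n - 1)

def se_mean_fallible(sd_sample, n, r):
    """Formula (65): samples matched on a fallible criterion correlating r with the test."""
    return se_mean_random(sd_sample, n) * math.sqrt(1 - r ** 2)

def se_mean_infallible(sd_sample, n, r):
    """Formula (66): same group retested; r is the reliability coefficient of the test."""
    return se_mean_random(sd_sample, n) * math.sqrt(1 - r)

sd, n = 10.0, 37          # fourth-grade figures quoted in the text
reliability = 0.89        # reliability coefficient claimed in the test manual
r_criterion = 0.60        # assumed correlation with a matching criterion

se_random = se_mean_random(sd, n)
se_retest = se_mean_infallible(sd, n, reliability)
se_matched = se_mean_fallible(sd, n, r_criterion)

print(f"random sampling (64a): {se_random:.2f}")
print(f"matched on fallible criterion (65), r assumed .60: {se_matched:.2f}")
print(f"same group retested (66): {se_retest:.2f}")
```

The retest value is about a third of the random-sampling value, as stated above, and the matched-group value under the assumed r lies between the two.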


APPLICATIONS OF THE STANDARD-ERROR CONCEPT 
Having developed our formula for determining the standard 
error of a mean, we wish now to see what is to be done with it 
in a particular research situation. One use is merely to give a 
sense of the variability of the mean in comparison with the 
absolute size of the mean, just as the size of a coefficient of correlation 
yields a sense of the closeness of relation between the 
correlated arrays. Therefore just to say that the mean is 47 
and its standard error is 4 is to some extent meaningful. But 
we can make the interpretation much more meaningful if we make 
certain assumptions about the distribution of the means of 
samples and under these assumptions draw inferences regarding 
the probability that the true mean does not lie beyond certain 
limits. 

1 KELLEY, T. L., Interpretation of Educational Measurements, World Book 
Company, 1927, p. 198. 

It may be reasonably assumed that if a great many samples
were drawn from a population and means of measurements of
some kind were calculated from the samples, these means would
make a normal distribution. Suppose we have given to 101 high-
school pupils a general information test and have obtained a mean
of 93 and a standard deviation of 20. The standard error of this
mean [formula (64a)] would be $20/\sqrt{100}$, which equals 2. If we
were to repeat the test with many other groups of high-school
pupils, the means would fluctuate in such manner as is indicated
in the accompanying bell-shaped curve. These means would
gather around the true mean at the center of the distribution.
What this true mean is we do not know. Most writers on
elementary statistics assume that it is the obtained mean that
lies at the center of the distribution. This is a wholly unwarranted
assumption and an unnecessary one. We shall not fall
into that blunder, although to do so would make our explanation
simpler. Let us place our obtained mean off somewhere from the
center of the distribution, say at OM.

[Fig. 15. Normal curve of the distribution of sample means, with ordinates marked OM, TM, and O'M'.]

It is possible that the true mean may lie, say, as high as 95, which
is one $\sigma_M$ above 93. For, if the
true mean were 95, we would still get a mean as low as 93
sometimes; in the whole distribution of samples, means of 93
or less would be obtained in all that proportion of the cases
lying in the tail of the normal distribution below point OM.
The ordinate OM lies $1\sigma$ away from the mean, and reference to
our table of integrals of the normal curve in the Appendix will
show that, when $x/\sigma = 1$, the percentage of cases between
the mean and the ordinate is 34.13, and hence the percentage
in the tail is (50 - 34.13) = 15.87. Hence we would be among
15.87 per cent of the cases if our sample mean lay at one standard
error below the true mean. But, if this is so, something has 
happened to us that would happen only 15.87 times out of 100. 
The odds against that are about 5.3 to 1 (34.13 + 50 divided by 
15.87). Hence we infer that the true mean probably does not 
lie as far above ours as we hypothetically assumed; the chances 
are only 15.87 in 100 that it does. In a similar manner we can 
test the probability of our having obtained the mean we did if 
the true mean lay as low as 91, which is one standard error below 
ours. The chances that we would have obtained our mean of 
93, if the true mean is only 91, are again only 15.87 in 100, as the 
proportion above ordinate O'M' indicates; hence the chances
are 15.87 in 100 that the true mean does lie as low as 91. Putting 
these cases together, we may say that the chances are 15.87 
in 100 that the true mean lies at 91 or below and also 15.87 in 100 
that it lies at 95 or above; but that, conversely, the chances are 
68.26 in 100 (about 2 to 1) that the true mean lies between 91 
and 95. We may make similar hypotheses about our chances 
of having obtained the mean we did if it lies two $\sigma_M$'s above or
two below the true mean and may find that the chances are
95.44 in 100 that the true mean lies between plus and minus two
standard errors of the obtained one, viz., between 97 and 89.
Or we can make our limits three or four standard errors, or
fractional parts of these measures. Or we can make computations
of the probability that the true mean does not lie beyond $a$
standard errors above or beyond $b$ standard errors below the
obtained one. We can also make these interpretations in terms
of P.E.'s instead of $\sigma$'s, either by employing tables made up in
terms of P.E. units or by remembering that P.E. equals $0.6745\sigma$
and working with tables in terms of $\sigma$ accordingly. The
interpretation in terms of P.E. is particularly simple because the chances
are 50-50, or the odds one to one, that the true mean does not 
lie more than one P.E. above or below the obtained one. This 
same type of interpretation holds for all other measures of 
reliability. 
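The percentages and odds quoted in this discussion come from the table of the normal curve; they can be reproduced without the table. The following is a minimal sketch using Python's standard library. The function and variable names are our own, and the exact values differ from the table entries only in the last figure (68.27 and 95.45 against the tabled 68.26 and 95.44).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit normal curve: mean 0, standard deviation 1

def chance_within(k_below, k_above):
    """Proportion of the normal curve lying between k_below standard
    errors below the center and k_above standard errors above it."""
    return STD_NORMAL.cdf(k_above) - STD_NORMAL.cdf(-k_below)

tail = 1.0 - STD_NORMAL.cdf(1.0)           # area beyond one standard error
odds_against = (1.0 - tail) / tail         # (34.13 + 50) divided by 15.87
within_one = chance_within(1.0, 1.0)       # true mean between 91 and 95
within_two = chance_within(2.0, 2.0)       # true mean between 89 and 97
within_pe = chance_within(0.6745, 0.6745)  # within one P.E. either way

print(round(100 * tail, 2), round(odds_against, 1))
print(round(100 * within_one, 2), round(100 * within_two, 2), round(within_pe, 2))
```

Unequal limits, such as $a$ standard errors above and $b$ below, are handled by passing different values for the two arguments.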


FIDUCIAL LIMITS


The above discussion of the limits between which the true 
mean or other statistic (called the parameter or the population 
parameter) may, with a given degree of confidence, be expected
to fall was conducted in quite untechnical terms. In recent years
this principle has been dealt with in a much more straightforward 
but highly technical manner by R. A. Fisher and others under the 
term fiducial limits. But, because of the desirability of covering 
the case of small samples as well as large samples, the standard 
of confidence is put in terms of probability of correctness rather 
than in terms of abscissa values, as we put it above. We shall 
later see that the proportion of the area of the distribution of $t$'s
(standard-error units) lying between certain $t$ values is somewhat
dependent upon the size of the samples, hence must be put
in terms of both $t$ and $n$; and this same thing is, consequently,
true of the probability that the parameter lies between the values 
corresponding to these points. The selection of the fiducial 
(confidence) limits is, of course, arbitrary, but 95 per cent is 
customarily taken as an acceptable fiducial probability for 
satisfactory significance and 99 per cent for high significance. 
The former standard means that 95 per cent of the estimates of 
the parameter made from an infinite supply of samples lie 
between ordinates OM and O'M' in our figure, page 136, while
5 per cent (2½ in each tail) lie outside. If the sample is large
so that the distribution may be considered normal, an inspection
of our table of the normal curve function on page 484 will show
that the $t$ corresponding to 0.025 in the tail is plus or minus 1.96.
In the latter case, with 0.005 in the tail, the $t$ is 2.5758. So in our
example with a large population (100), a mean of 93, and a stand- 
ard error of the mean of 2, the chances are 95 in 100 that the 
true mean lies within $93 \pm (2)(1.96)$, or between 89.08 and
96.92. Correspondingly, the chances are 99 in 100 that the true
mean lies within $93 \pm (2)(2.5758)$, or between 87.85 and 98.15.
If the sample had been small, say 12 individuals, we would go
to Fisher’s $t$ table, page 173, enter it with $n = N - 1 = 11$, and
find along that row in the column headed .05 the $t = 2.201$,
and in the column headed .01 the $t = 3.106$. Then the corresponding
fiducial limits for a fiducial probability of 95 per
cent would be $93 \pm (2)(2.201) = 93 \pm 4.402$; and for a 99 per
cent fiducial probability, $93 \pm (2)(3.106)$. Thus the “confidence
belt” would extend between 88.598 and 97.402 in the former case
and between 86.788 and 99.212 in the latter. We could claim
that the true mean would be found somewhere within the former
range with the chances 95 in 100 that our claim would be correct
or that it would fall somewhere within the latter range with the
chances 99 in 100 that the claim would be correct. 
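The computation of fiducial limits in the two cases may be sketched as follows. The function names are our own. The large-sample multipliers are taken from the normal curve by way of the standard library; the small-sample multipliers, 2.201 and 3.106, are simply copied from the text, since the standard library has no table of $t$.

```python
from statistics import NormalDist

def fiducial_limits(mean, se, t):
    """Return (lower, upper) = mean -/+ t standard errors."""
    return (mean - t * se, mean + t * se)

def normal_t(fiducial_probability):
    """Normal deviate leaving half the excluded area in each tail
    (appropriate for large samples only)."""
    return NormalDist().inv_cdf(0.5 + fiducial_probability / 2.0)

MEAN, SE = 93.0, 2.0  # the example of the text

large_95 = fiducial_limits(MEAN, SE, normal_t(0.95))  # t about 1.96
large_99 = fiducial_limits(MEAN, SE, normal_t(0.99))  # t about 2.5758

# Sample of 12 (n = 11): t values copied from the text's citation of
# Fisher's table, not computed here.
small_95 = fiducial_limits(MEAN, SE, 2.201)
small_99 = fiducial_limits(MEAN, SE, 3.106)

for limits in (large_95, large_99, small_95, small_99):
    print(round(limits[0], 3), round(limits[1], 3))
```

The widening of the belt from the large-sample to the small-sample case is due wholly to the larger multiplier; the standard error of 2 is held the same in both.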

This interpretation has been applied to means. A correspond- 
ing interpretation is applicable to variability measures, to r’s, 
to proportions, or to any other statistics where we can know the 
form of the distribution of the estimates of the parameter made 
from samples. It is also applicable to differences between 
statistics, or to sums of statistics. 

It would be beyond the scope of this book to go into a technical 
treatment of this issue. The reader who is interested may 
pursue it in the monographic literature. He should begin with 
Fisher’s initial article, “Inverse Probability,” Proceedings of the 
Cambridge Philosophical Society, Vol. 26, pages 528 to 535 (1930), 
which he will not find very difficult. Another general expository 
article is by S. S. Wilks, “Fiducial Distributions in Fiducial 
Inference,” Annals of Mathematical Statistics, Vol. 9, pages 
272 to 280 (1938). For a technical discussion of the case of 
large samples see Wilks, “Shortest Average Confidence Intervals
from Large Samples,” Annals of Mathematical Statistics, Vol. 9,
pages 166 to 175 (1938); and for a thorough discussion of the 
general case, see J. Neyman, “Outline of a Theory of Statistical 
Estimation Based on the Classical Theory of Probability,” 
Transactions of the Royal Society of London, Philosophical, Series 
A, Vol. 236, pages 333 to 380 (1937). 


THE STANDARD ERROR OF A STANDARD DEVIATION 
We shall develop first a formula for the standard error of
$s^2$, an estimate of the population variance, and then pass to
the standard error of $s$ and of $\sigma$. By definition of a standard
deviation

$$\sigma^2 = \frac{\Sigma x^2}{n}$$

when the values are taken as deviations from the mean of the
sample, the summation is through the sample, and $n$ is the number
of individuals in the sample. If we conceive the $x$'s as
deviations from the mean of the whole population, we shall have
$s^2$ if the summation runs through the sample,¹ or $\tilde{\sigma}^2$, the
square of the theoretical standard deviation of the population, if
the summation covers the whole population.

Remember that $\tilde{\sigma}^2$ is the mean of all the $s^2$'s. Let the
deviations of the several $s^2$'s from this $\tilde{\sigma}^2$ be represented by
$d_1$, $d_2$, $d_3$, etc. Then $d_1 = s_1^2 - \tilde{\sigma}^2$; and squaring,
$d_1^2 = (s_1^2 - \tilde{\sigma}^2)^2$. But $s_1^2 = \Sigma x^2/n$, where the summation
is over the sample. Substituting accordingly,

$$d_1^2 = \left(\frac{\Sigma x^2}{n} - \tilde{\sigma}^2\right)^2 = \left(\frac{\Sigma(x^2 - \tilde{\sigma}^2)}{n}\right)^2 = \left(\frac{(x_a^2 - \tilde{\sigma}^2)}{n} + \frac{(x_b^2 - \tilde{\sigma}^2)}{n} + \cdots + \frac{(x_n^2 - \tilde{\sigma}^2)}{n}\right)^2$$

In order to carry along less cumbersome notation, we shall
represent the quantities in parentheses by $w_a$, $w_b$, etc. Then

$$d_1^2 = \frac{w_a^2 + w_b^2 + \cdots + w_n^2 + 2w_aw_b + \cdots}{n^2}$$

We shall have similar expressions for $d_2$, $d_3$, etc., involving
each of the other samples of the whole set of $S$. If we sum for
all these squared deviations and divide by $S$, we shall have

$$\frac{\Sigma d^2}{S} = \frac{\Sigma(\Sigma w^2 + 2\Sigma\Sigma w_iw_j)}{Sn^2}$$

or $Sn^2\sigma_{s^2}^2 = \Sigma(\Sigma w^2 + 2\Sigma\Sigma w_iw_j)$.

But this is precisely similar in form to Eq. (A) in our development
of the formula for the standard error of a mean. It will
simplify in precisely the same manner, so that we arrive at an
equation parallel to Eq. (D),

$$\sigma_{s^2}^2 = \frac{\Sigma w^2}{nN}\left(1 - \frac{n - 1}{N - 1}\right)$$

where $N$ is the whole population from which the samples are
drawn. Substituting for $w$ the value for which we let it stand
and for $(n - 1)/(N - 1)$ $p$ in the same sense as used in our
development of the formula for the sigma of a mean,

$$\sigma_{s^2}^2 = \frac{\Sigma(x^2 - \tilde{\sigma}^2)^2}{nN}(1 - p) = \left(\frac{\Sigma x^4}{nN} - \frac{2\tilde{\sigma}^2\Sigma x^2}{nN} + \frac{\Sigma\tilde{\sigma}^4}{nN}\right)(1 - p)$$

Multiplying numerator and denominator of the first quantity
by $\tilde{\sigma}^4$, remembering that $\Sigma x^2/N$ is $\tilde{\sigma}^2$ (since the $x$'s have been
summed for the whole population with no duplicates) and that,
in summing, $\tilde{\sigma}^4$ was taken $N$ times so that $\Sigma\tilde{\sigma}^4$ would equal
$N\tilde{\sigma}^4$, we have

$$\sigma_{s^2}^2 = \frac{1}{n}\left(\frac{\Sigma x^4}{N\tilde{\sigma}^4}\tilde{\sigma}^4 - 2\tilde{\sigma}^4 + \tilde{\sigma}^4\right)(1 - p)$$

But $\Sigma x^4/N\tilde{\sigma}^4$ is $\beta_2$. Therefore $\sigma_{s^2}^2 = (1/n)(\beta_2\tilde{\sigma}^4 - \tilde{\sigma}^4)(1 - p)$.
In a normal distribution $\beta_2$ equals 3. We may safely assume
normality here, since the distribution to which the $\beta_2$ refers is not
that of one of the samples but that of the very large total population,
$N$. Substituting 3 for $\beta_2$, we have

$$\text{(E)}\qquad \sigma_{s^2}^2 = \frac{1}{n}(3\tilde{\sigma}^4 - \tilde{\sigma}^4)(1 - p) = \frac{2\tilde{\sigma}^4}{n}(1 - p)$$

The $p$ is zero for random samples from an infinite universe. So,

$$\sigma_{s^2}^2 = \frac{2\tilde{\sigma}^4}{N} \quad\text{or}\quad \sigma_{s^2} = \tilde{\sigma}^2\sqrt{\frac{2}{N}}$$
(Standard error of estimates of the population variance) (67)

where $N$, as in our other working formulas, is the number of cases in the sample.

¹ Irwin proves that $\Sigma(x_r - m)^2/n$ gives an unbiased estimate of the
population variance, where $x_r$ is a variable in sample $r$ and $m$ is the grand
mean. That is, the sum of the squared deviations from the population
mean divided by the number of items in the sample gives $s^2$. J. Roy.
Statistical Soc., Vol. 94, p. 286.


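Since, whatever the nature of the scores represented by $x$
and whatever the multiplier represented by $a$, $\sigma_{ax} = a\sigma_x$ and
$\sigma_{ax}^2 = a^2\sigma_x^2$ (see page 77), and since $\sigma^2 = \left(\frac{n - 1}{n}\right)s^2$, and since
$\tilde{\sigma}^2 = \left(\frac{n}{n - 1}\right)\bar{\sigma}^2$,

$$\sigma_{\sigma^2}^2 = \left(\frac{n - 1}{n}\right)^2\sigma_{s^2}^2 = \left(\frac{n - 1}{n}\right)^2\frac{2\tilde{\sigma}^4}{n} = \left(\frac{n - 1}{n}\right)^2\frac{2\bar{\sigma}^4}{n}\left(\frac{n}{n - 1}\right)^2$$

$$\sigma_{\sigma^2}^2 = \frac{2\bar{\sigma}^4}{n} \quad\text{or}\quad \sigma_{\sigma^2} = \bar{\sigma}^2\sqrt{\frac{2}{n}}$$
(Standard error of the sample variance) (67a)

Note that $\bar{\sigma}$ means the average sample value, not the population
value. Ordinarily we must substitute for it the $\sigma$ of the sample
we have in hand. For the $\tilde{\sigma}$ of formula (67) we must substitute
the $s$ computed from the sample in hand by dividing the sum of
squares by $(n - 1)$ instead of by $n$, as explained on page 70.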
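The outcome for random samples from a normal population, namely that the estimates $s^2 = \Sigma x^2/n$ (deviations taken from the population mean) have a standard error of $\tilde{\sigma}^2\sqrt{2/n}$, can be checked by brute-force sampling. The sketch below is an illustration only: the sample size of 25, the 20,000 samples, the unit standard deviation, and the fixed seed are all arbitrary assumptions of ours.

```python
import math
import random
import statistics

def simulated_se_of_variance(n, sigma, samples, seed=1940):
    """Standard deviation, over many random samples, of s^2 = sum(x^2)/n,
    the x's being deviations from the known population mean (zero)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(samples):
        total = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))
        estimates.append(total / n)
    return statistics.pstdev(estimates)

def predicted_se_of_variance(n, sigma):
    """sigma^2 * sqrt(2 / n), the value for random samples from a
    normal population."""
    return sigma ** 2 * math.sqrt(2.0 / n)

n, sigma = 25, 1.0
simulated = simulated_se_of_variance(n, sigma, samples=20000)
predicted = predicted_se_of_variance(n, sigma)

print(round(simulated, 3), round(predicted, 3))
```

The two figures should agree to within the sampling error of the experiment itself, which for 20,000 samples is well under 1 per cent of the value.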

The above paragraphs gave us the standard errors of $s^2$ and
of $\sigma^2$. We need also the standard errors of $s$ and of $\sigma$. These do
not come through quite so smoothly; in any simple form our
formulas must be approximations.¹

Let us set up an identity, then put it through certain algebraic
transformations. $s_1$ shall represent the estimate of the population
variability made from any one sample.

$$s_1^2 = \tilde{\sigma}^2 + s_1^2 - \tilde{\sigma}^2 = \tilde{\sigma}^2\left(1 + \frac{s_1^2 - \tilde{\sigma}^2}{\tilde{\sigma}^2}\right)$$

$$s_1 = \tilde{\sigma}\left(1 + \frac{s_1^2 - \tilde{\sigma}^2}{\tilde{\sigma}^2}\right)^{1/2}$$

Expanding the expression on the right by the binomial theorem,

$$s_1 = \tilde{\sigma}\left[1 + \frac{1}{2}\left(\frac{s_1^2 - \tilde{\sigma}^2}{\tilde{\sigma}^2}\right) - \frac{1}{8}\left(\frac{s_1^2 - \tilde{\sigma}^2}{\tilde{\sigma}^2}\right)^2 + \cdots\right]$$

Since $(s^2 - \tilde{\sigma}^2)/\tilde{\sigma}^2$ has usually a value less than 1, the terms
beyond the second will be small in value compared with the first
two. We may, therefore, take, approximately,

$$s_1 = \tilde{\sigma} + \frac{s_1^2 - \tilde{\sigma}^2}{2\tilde{\sigma}} = \frac{s_1^2}{2\tilde{\sigma}} + \frac{\tilde{\sigma}}{2}$$

This expression contains a multiple of $s^2$ and a constant addend.
Since the $s$'s are regarded as variates of which we are to estimate
the standard deviation, we have a situation similar to $\sigma_{ax+b} = a\sigma_x$
and $\sigma_{ax+b}^2 = a^2\sigma_x^2$ (see page 77). Applying this principle,

$$\sigma_s = \frac{1}{2\tilde{\sigma}}\sigma_{s^2}$$

Substituting for $\sigma_{s^2}$ the value given in formula (67),

$$\sigma_s = \frac{1}{2\tilde{\sigma}}\sqrt{\frac{2\tilde{\sigma}^4}{N}} = \frac{\tilde{\sigma}}{\sqrt{2N}}$$
(Standard error of s) (67b)

Since $\sigma = s\sqrt{(n - 1)/n}$, the standard error of $\sigma$ will be that
multiple of the standard error of $s$, and $\sigma_\sigma^2$ will be the square of
that multiple of $\sigma_s^2$. Therefore, for random samples,

$$\sigma_\sigma^2 = \frac{n - 1}{n}\cdot\frac{\tilde{\sigma}^2}{2n} = \frac{\bar{\sigma}^2}{2n}, \qquad \sigma_\sigma = \frac{\bar{\sigma}}{\sqrt{2n}}$$

¹ T. Kondo presents (Biometrika, Vol. 22, pp. 36-64) a thorough study of
the sampling variance of $\sigma$, with different assumptions and degrees of
approximation. The outcomes are complicated formulas which differ
somewhat in the values they give according to the assumptions. Fisher
and some others give formula (67) with $(N - 1)$ instead of $N$ as the
denominator. But we cannot find any approximation in our derivation which if
corrected would lead to this $(N - 1)$. If that denominator is $(N - 1)$, the
numerator in our $\sigma_\sigma$ formulas would be the population value instead of the
average sample value.


Note that $\bar{\sigma}$ is the average standard deviation from samples, not $\tilde{\sigma}$.

It will be where we are comparing variabilities in small samples
of unequal sizes that we shall need $\sigma_s$ instead of $\sigma_\sigma$, because when
samples are very small and unequal in size, or at least when one 
of them is small, the sample standard deviations are likely to 
give a false impression of the true relation, as explained on 
pages 69 to 71. 
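Because the formula for the standard error of a standard deviation rests on an approximation, it is instructive to compare it with the actual scatter of sample standard deviations. The sketch below is an illustration under assumptions of our own choosing: a normal population with a standard deviation of 10, samples of 50, 5,000 samples, and a fixed seed. It takes the prediction as the average sample standard deviation divided by $\sqrt{2n}$.

```python
import math
import random

def sample_sd(values):
    """Standard deviation about the sample mean, dividing by n
    (the sample sigma of the text, not the estimate s)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def mean_and_spread_of_sds(n, sigma, samples, seed=1940):
    """Return (average, standard deviation) of the sample sigmas
    obtained from many random samples of size n."""
    rng = random.Random(seed)
    sds = [sample_sd([rng.gauss(0.0, sigma) for _ in range(n)])
           for _ in range(samples)]
    mean_sd = sum(sds) / samples
    spread = math.sqrt(sum((s - mean_sd) ** 2 for s in sds) / samples)
    return mean_sd, spread

n = 50
mean_sd, observed = mean_and_spread_of_sds(n, sigma=10.0, samples=5000)
predicted = mean_sd / math.sqrt(2 * n)  # average sample sigma / sqrt(2n)

print(round(observed, 3), round(predicted, 3))
```

With samples of this size the approximate formula and the observed scatter should differ by only a few per cent; the agreement may be expected to worsen as the samples are made smaller.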

We have been considering the special case of random samples. 
In the more general case, involved in Eq. (F) above, the factor 
$(1 - p)$ is included. That belongs here as well. The $p$ has here
the same force as in our discussion on the standard error of the 
mean. It is the percentage our sample is of the whole population, 
hence also the percentage of overlapping from sample to sample. 
If the successive samples are correlated with an initial array 
with which they are matched, the p represents the coefficient 
of correlation between this array and each of the samples, so 
that we have the following formulas for the standard error of a 
standard deviation: 


$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}}\sqrt{1 - r}$$
(Standard error of a standard deviation when samples are matched with
a true criterion, as where we take repeated tests of the same group) (68)

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}}\sqrt{1 - r^2}$$
(Standard error of a standard deviation when samples are matched on
a fallible criterion, as where we measure the moral judgment of
pupils of the same average educational age) (68a)

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}}$$
(Standard error of a standard deviation in case of random samples from
an unrestricted population, where consequently no correlation is
present) (68b)


STANDARD ERROR OF A FREQUENCY AND OF A PROPORTION 

In our chapter on the normal curve it will be shown (page 298) 
that the standard deviation of a point binomial is $\sqrt{npq}$. In
this formula, $p$ is the probability of “success” in the case of
any one “event,” and $q$ is the probability of “failure,” while
$n$ is the number of “events” and therefore the exponent of the
point binomial. To speak concretely, if $n$ is the number of
pennies tossed, $p$ the probability of a “head” (of success) in
each penny, and $q$ the probability of a tail (of failure), then
$(p + q)^n$ gives the distribution of successes and $\sqrt{npq}$ is the
standard deviation of the distribution of the successes or of the
failures. With ordinary pennies $p = q = \frac{1}{2}$, but with weighted
coins $p$ and $q$ might have different values; but always $p + q$
would be 1.

Now it makes no difference to the nature of the distribution 
or to its standard deviation whether the n pennies are tossed at 
one throw or whether they are tossed one at a time and considered 
in sets of $n$ each. It is this latter sort of case we would have if we
entered one after another 100 homes as a sample to determine
how many of them have telephones, if we measured as a sample
100 children to ascertain what proportion have intelligence
quotients between 40 and 70, or if we investigated proportions
in any sort of sample at all. If we think of ourselves as drawing
such samples one after another, we have precisely the same
sort of situation as if we toss a set of $n$ coins repeatedly whether
all at once or one at a time. Hence, if we know the probability
of success in the case of particular individuals, we can foretell
what standard deviation to expect if we were to continue drawing
until we had a large distribution of samples; it would be $\sqrt{npq}$
where the n is the number constituting a set, which is the sort 
of application we are in the habit of calling the population of 
our sample. Thus to draw sample after sample is the same 
thing in principle as tossing a set of n coins many times in succes- 
sion. In such application we seldom actually draw a large 
number of samples and compute the actual standard deviation 
but, instead, infer it from what we know of the properties of the 
distribution that generalizes the point binomial; hence we speak 
of the standard error of sampling rather than of the standard 
deviation of a binomial distribution. 

But how can we know the p and the q? We assume them to 
be the ones indicated by the behavior of our sample in hand. 
That is, we take it that we have in hand the most probable com- 
bination of successes and failures and use this combination to 
define the p and the q; if 25 per cent of the homes in our sample 
have telephones and 75 per cent do not, we assume for purposes 
of our formula that $p = \frac{1}{4}$ and $q = \frac{3}{4}$. This assumption is a
precarious one, since we may have happened upon a sample that
involves a marked deviation from the modal one; but we can do 
no better than to accept it on faith as representative. 

Success may be defined in any way that fits our purpose: 
having a telephone, brushing the teeth at least once a day, 
falling between score 45 and score 64 on a certain geography 
test, standing above score 90 on an intelligence test, or whatever 
else we please. We can, thus, very appropriately define success 
as coming within any category in a distribution and failure as 
lying outside such category; and hence the standard error of a 
frequency in any category is given by the formula 


$$\sigma_f = \sqrt{Npq}$$
(Standard error of a frequency in any category in a binomial distribution) (69)

Here $N$ is the total population of the sample (including both the
cases within and those without the category under consideration),
$p$ is the probability of being in the category as indicated by the
proportion that is in it in the sample, and $q$ equals $(1 - p)$.

If we divide the frequency by the total number, we shall, of 
course, have the proportion of individuals in the category. 


We have already shown (page 77) that $\sigma_{x/a} = \sigma_x/a$. Applying
this principle, but remembering that we must square our divisor
when we place it under the radical sign, we have

$$\sigma_p = \frac{\sqrt{Npq}}{N} = \sqrt{\frac{pq}{N}}$$
(Standard error of a proportion) (70)
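Applied to the telephone illustration used above, a sample of 100 homes of which 25 per cent have telephones, the standard errors of the frequency and of the proportion work out as in the following sketch. The function names are our own; the formulas are $\sqrt{Npq}$ for a frequency and $\sqrt{pq/N}$ for a proportion.

```python
import math

def se_frequency(n, p):
    """Standard error of a frequency: sqrt(N * p * q)."""
    return math.sqrt(n * p * (1.0 - p))

def se_proportion(n, p):
    """Standard error of a proportion: sqrt(p * q / N)."""
    return math.sqrt(p * (1.0 - p) / n)

homes, p_telephone = 100, 0.25
se_f = se_frequency(homes, p_telephone)   # in homes
se_p = se_proportion(homes, p_telephone)  # as a proportion

print(round(se_f, 2), round(se_p, 4))
```

The second value is the first divided by $N$, as the principle $\sigma_{x/a} = \sigma_x/a$ requires; about 4.3 homes in 100 is the same thing as a proportion of about .043.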


THE STANDARD ERROR OF A PERCENTILE 
Let $P$ be the true point designating a percentile in which
we are interested, i.e., the point at which this percentile would
be located in the average of a very large number of samples
from a population. In a particular sample the obtained percentile
might go up to $P_2$, which is a distance $\Delta p$ above $P$, or
down to $P_1$, a distance of $\Delta p$ below.

[Figure: section of the distribution curve with the points $P_1$, $P$, and $P_2$ marked on the base line.]

If we take the figure $PP_2P_2'P'$
to be for practical purposes a rectangle and denote its height by $y$,
its area ($f$) would be $y\Delta p$. We would have, therefore, $y\Delta p = f$,
or $\Delta p = f/y$. Squaring, $\Delta p^2 = f^2/y^2$. Summing for all the
samples and dividing by the number of samples,

$$\frac{\Sigma\Delta p^2}{S} = \frac{\Sigma f^2}{Sy^2} \quad\text{or}\quad \sigma_P^2 = \frac{\sigma_f^2}{y^2}$$

But $\sigma_f^2$ is the same as the $\sigma^2$ of the area of the tail of the
distribution, $p$; for the tail is bounded by the line $PP'$ so that the
positive and negative increments of the tail are identical with
those that constitute $f$. We have just shown that the standard
error of a frequency, and hence of the tail $p$ and consequently
of $f$, is

$$\sigma_f = \sqrt{Npq}, \quad\text{whence}\quad \sigma_f^2 = Npq$$


where N is the total population of the sample, p is the proportion 
of cases in the category under consideration (here the one tail), 
and q is the proportion of cases outside the category (in this 
case the other tail). 

Substituting this value in the equation above, we have 


$$\sigma_P^2 = \frac{Npq}{y^2}$$
(One form of the standard error of a percentile) (71)

This is the formula we need, but we desire a simpler value
for the $y^2$. The $y$ is the ordinate of a normal distribution at
the position $P$. In our treatment of the normal curve we show
that

$$y = \frac{N}{\sigma\sqrt{2\pi}}e^{-x^2/2\sigma^2}$$

We may separate this into two factors as follows:

$$y = \frac{N}{\sigma}\left(\frac{1}{\sqrt{2\pi}}e^{-x^2/2\sigma^2}\right)$$

The part in parentheses is $z$, for which numerical equivalents
are given in Table XLIV in the Appendix, since $z$ is defined as
the ordinate of a normal distribution in which $N = 1$. Therefore

$$y = \frac{Nz}{\sigma} \quad\text{and}\quad y^2 = \frac{N^2z^2}{\sigma^2}$$

Substituting this value of $y^2$ in formula (71), and simplifying
the formula, we have

$$\sigma_P^2 = \frac{Npq}{N^2z^2/\sigma^2} = \frac{\sigma^2Npq}{N^2z^2} = \frac{\sigma^2pq}{z^2N}$$

wherefore

$$\sigma_P = \frac{\sigma}{z}\sqrt{\frac{pq}{N}}$$
(Standard error of a percentile in a normal distribution) (72)

The resultant standard error of the percentile is in the same
units as the $\sigma$; if the $\sigma$ is put in terms of score points the standard
error is in terms of score points.

As an example, take a distribution of 100 cases with a standard 
deviation of 12, and find the standard error of the 20th percentile. 
Looking in Table XLIII, we find that, when p = .20, z = 0.2800. 
Substituting the known values in our formula we have 


$$\sigma_{P_{20}} = \frac{12}{0.280}\sqrt{\frac{(.80)(.20)}{100}} = \frac{12}{0.280}\left(\frac{.4}{10}\right) = 1.71$$


Certain percentiles are employed so frequently that it will 
be worth while to determine here their standard errors. One 
of these is the median. The median is the 50th percentile. 
In order to determine its standard error we need only make both 
p and q equal .50, determine the z from our table at the point 
where $p = .50$ (which is 0.3989), and solve for $\sigma_{mdn}$.

$$\sigma_{mdn} = \frac{\sigma}{0.3989}\sqrt{\frac{(.50)(.50)}{N}} = \frac{1.253\sigma}{\sqrt{N}}$$
(Standard error of a median) (73)


Since, when working with medians instead of means, we usually 
have point measures in hand rather than moment measures, 
this formula will customarily be more convenient when put in 
terms of probable error and of the quartile deviation of the
distribution. In a normal distribution $Q$ is 0.6745 times $\sigma$, and
similarly P.E. is 0.6745 times the standard error. Therefore,
multiplying both sides of our equation by 0.6745 and substituting
the equivalent symbols, we have

$$\text{P.E.}_{mdn} = \frac{1.253Q}{\sqrt{N}}$$
(Probable error of a median) (73a)

The standard error of a quarter point is found by a similar
procedure as follows:

$$\sigma_{Q_1} \text{ or } \sigma_{Q_3} = \frac{\sigma}{0.3178}\sqrt{\frac{(.75)(.25)}{N}} = \frac{1.3626\sigma}{\sqrt{N}} = \frac{2Q}{\sqrt{N}} \text{ (nearly)}$$
(Standard error of a quarter point) (74)

Multiplying both sides of the equation by 0.6745, as above:

$$\text{P.E.}_{Q_1} \text{ or } \text{P.E.}_{Q_3} = \frac{1.3626Q}{\sqrt{N}}$$
(Probable error of a quarter point) (74a)

By the same procedure

$$\text{P.E.}_{P_{10}} \text{ or } \text{P.E.}_{P_{90}} = \frac{1.709Q}{\sqrt{N}}$$
(Probable error of the 10th or the 90th percentile) (74b)
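The standard error of any percentile in a normal distribution, $(\sigma/z)\sqrt{pq/N}$, lends itself to a single short routine in which the ordinate $z$ is computed rather than read from the table. The sketch below is ours, with names of our own choosing. It reproduces the example of the 20th percentile and the multipliers of $\sigma/\sqrt{N}$ for the median, the quarter points, and the 10th percentile.

```python
import math
from statistics import NormalDist

def se_percentile(p, sd, n):
    """(sd / z) * sqrt(p * q / n), where z is the ordinate of the unit
    normal curve at the point that cuts off a tail of p."""
    unit = NormalDist()
    z = unit.pdf(unit.inv_cdf(p))
    return (sd / z) * math.sqrt(p * (1.0 - p) / n)

# The worked example: 100 cases, standard deviation 12, 20th percentile.
se_p20 = se_percentile(0.20, sd=12.0, n=100)

# With sd = 1 and n = 1 the result is the multiplier of sigma / sqrt(N).
median_multiplier = se_percentile(0.50, sd=1.0, n=1)
quartile_multiplier = se_percentile(0.25, sd=1.0, n=1)
decile_multiplier = se_percentile(0.10, sd=1.0, n=1)

print(round(se_p20, 2), round(median_multiplier, 3),
      round(quartile_multiplier, 4), round(decile_multiplier, 3))
```

Multiplying any of these standard errors by 0.6745 gives the corresponding probable error; and since $Q = 0.6745\sigma$ in a normal distribution, the multiplier of $\sigma$ in the standard error is also the multiplier of $Q$ in the probable error.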


THE STANDARD ERROR OF INTERPOINT RANGES 

We often wish to state our variabilities in terms of the range 
between two points, as the interquartile range or the range 
between the first and the ninth decile points which Kelley has 
called D. We can compute a standard error for such ranges as 
follows, considering our statistics to be in the form of deviations 
from the means of the whole sets of these statistics. We shall 
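let $P_1$ and $P_2$ represent the two percentile points between which
we are considering the range.

$$\sigma_{P_2 - P_1}^2 = \frac{\Sigma(P_2 - P_1)^2}{S} = \frac{\Sigma P_2^2}{S} + \frac{\Sigma P_1^2}{S} - \frac{2\Sigma P_1P_2}{S}$$

If we recognize the first term as $\sigma_{P_2}^2$ and the second as $\sigma_{P_1}^2$ and
multiply the third by $(\sigma_{P_1}\sigma_{P_2})/(\sigma_{P_1}\sigma_{P_2})$, we shall have

$$\sigma_{P_2 - P_1}^2 = \sigma_{P_2}^2 + \sigma_{P_1}^2 - 2\frac{\Sigma P_1P_2}{S\sigma_{P_1}\sigma_{P_2}}\sigma_{P_1}\sigma_{P_2}$$

This last term contains an expression for $r_{P_1P_2}$. When we
substitute this we shall have

$$\sigma_{P_2 - P_1}^2 = \sigma_{P_1}^2 + \sigma_{P_2}^2 - 2r_{P_1P_2}\sigma_{P_1}\sigma_{P_2}$$
(General formula for the standard error of interpercentile ranges) (75)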
We encounter here the need for a formula for the $r$ between
percentiles. The following development of such formula follows
in the main the development by Yule.

Let $P_{p_1}$ and $P_{p_2}$ be two percentile points, the former marking
off a tail of $p_1$ in the distribution and the latter a tail of $p_2$. If
we make here the same assumption that we did in connection with
the development of a formula for the standard error of the
percentile (see page 145), viz., that through the small distance
through which the percentile fluctuates from sample to sample
the curve may be considered substantially flat, the deviations
in the lower percentile are directly proportional to those of
the area of the tail of the distribution ($p_1$), while those of the
upper percentile are proportional to $p_2$ but of opposite sign. We
may, therefore, take the correlation between the percentiles to be
the same numerically as that between the areas $p_1$ and $p_2$ but of
opposite sign. Our problem, then, is to get a value for the $r$
between the proportions $p_1$ and $p_2$.

[Figure: distribution curve with the percentile points $P_{p_1}$ and $P_{p_2}$ marking off the tails $p_1$ and $p_2$.]

If there is a deficiency of observations below the lower percentile,
so that in the sample in question the area below $P_{p_1}$
differs from the true proportion by $\delta_1$, that deficiency may be
expected to be offset by a surplus above apportioned to the other
categories in proportion to their respective sizes. Thus $p_2$ will
have a positive increment $\delta_2$ of such size that

$$\delta_2 = -\frac{p_2}{q_1}\delta_1$$

where $q_1$ is $(1 - p_1)$ and is, therefore, the whole area within
which the $\delta_1$ increment is to be apportioned. Thus $-(p_2/q_1)$ is
the regression coefficient of $\delta_2$ on $\delta_1$. The coefficient of correlation
is (see page 111) the regression coefficient multiplied by
the ratio of the standard deviations of the two series. Therefore

$$r_{p_1p_2} = -\frac{p_2}{q_1}\cdot\frac{\sigma_{p_1}}{\sigma_{p_2}} = -\frac{p_2}{q_1}\cdot\frac{\sqrt{p_1q_1/N}}{\sqrt{p_2q_2/N}} = -\sqrt{\frac{p_1p_2}{q_1q_2}}$$
(Coefficient of correlation between tails in the same distribution) (76)

Since, as said above, the $r$ between percentiles is the same
numerically as that between the tails marked off by them but
of opposite sign,

$$r_{P_{p_1}P_{p_2}} = \sqrt{\frac{p_1p_2}{q_1q_2}}$$
(Coefficient of correlation between percentiles in the same distribution) (77)
Ordinarily we shall be concerned only with ranges between 
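symmetrically placed points, i.e., between points equally
distant from their respective ends of the distribution. In this
case $p_1 = p_2$ and $q_1 = q_2$, so that $r = p/q$. Furthermore, in this
case the standard errors of the two percentiles are equal, so that
$\sigma_{P_{p_1}}$ or $\sigma_{P_{p_2}} = \frac{\sigma}{z}\sqrt{\frac{pq}{N}}$. We may then simplify our formula for
the standard error of a range as follows:

$$\sigma_{P_{p_2} - P_{p_1}}^2 = \sigma_{P_{p_1}}^2 + \sigma_{P_{p_2}}^2 - 2r_{P_{p_1}P_{p_2}}\sigma_{P_{p_1}}\sigma_{P_{p_2}} = 2\sigma_{P_{p_1}}^2(1 - r_{P_{p_1}P_{p_2}})$$

$$\sigma_{P_{p_2} - P_{p_1}} = \sigma_{P_{p_1}}\sqrt{2(1 - r_{P_{p_1}P_{p_2}})}$$
(Standard error of the range between symmetrically placed percentiles) (78)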
With this formula let us find the standard error of $Q$, the semi-
interquartile range, often called the quartile deviation. (Notice
that $Q$ is very different from $Q_1$.) $Q$ is half the range between
$Q_1$ and $Q_3$, that is, half the range between the 25th and the 75th
percentiles. Sigma of the 25th and also of the 75th percentile
is $\frac{\sigma}{z}\sqrt{\frac{(.25)(.75)}{N}}$. The $r$ between these two percentiles is .25/.75,
which equals .333. From our tables in the Appendix $z$ is found
to be 0.3178. Substituting these values, we get

$$\sigma_{Q_3 - Q_1} = \frac{\sigma}{0.3178}\sqrt{\frac{(.25)(.75)}{N}}\sqrt{2(1 - .3333)} = \frac{1.5734\sigma}{\sqrt{N}}$$

Since $Q$ is half of the interquartile range its standard error will
be half as large. Therefore, dividing by 2, we get

$$\sigma_Q = \frac{0.7867\sigma}{\sqrt{N}} = \frac{1.166Q}{\sqrt{N}}$$

$$\text{P.E.}_Q = \frac{0.5306\sigma}{\sqrt{N}} = \frac{0.7867Q}{\sqrt{N}}$$
(Standard and probable errors of the semi-interquartile range) (79)

$D$ was defined above as the range between the 1st and the 9th
decile points, i.e., between the 10th and the 90th percentiles.
The substitutions required in the formula for getting its standard
error are as follows:

$$\sigma_D = \frac{\sigma}{0.1755}\sqrt{\frac{(.10)(.90)}{N}}\sqrt{2(1 - .1111)} = \frac{2.2792\sigma}{\sqrt{N}} = \frac{3.38Q}{\sqrt{N}}$$

$$\text{P.E.}_D = \frac{1.54\sigma}{\sqrt{N}} = \frac{2.28Q}{\sqrt{N}}$$
(Probable error of the 10th to 90th percentile range) (80)
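The two ranges just treated differ only in the proportion $p$ cut off in each tail, so one routine serves for both. The following is a sketch under the assumptions of the text (normal distribution, symmetrically placed points, $r = p/q$); the function name is our own.

```python
import math
from statistics import NormalDist

def se_symmetric_range(p, sd, n):
    """Standard error of the range between two percentiles that each
    cut off a tail of p, taking r = p / q between them."""
    q = 1.0 - p
    unit = NormalDist()
    z = unit.pdf(unit.inv_cdf(p))               # ordinate at the percentile
    se_point = (sd / z) * math.sqrt(p * q / n)  # standard error of one point
    r = p / q                                   # correlation of the two points
    return se_point * math.sqrt(2.0 * (1.0 - r))

# Multipliers of sigma / sqrt(N), obtained by setting sd = 1 and n = 1.
interquartile = se_symmetric_range(0.25, 1.0, 1)  # about 1.5734
semi_interquartile = interquartile / 2.0          # about 0.7867
decile_range = se_symmetric_range(0.10, 1.0, 1)   # about 2.2792

print(round(interquartile, 4), round(semi_interquartile, 4),
      round(decile_range, 4))
```

Dividing each multiplier by 0.6745 expresses the standard error in terms of $Q$ instead of $\sigma$, and multiplying by 0.6745 converts it to a probable error.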


We shall give without proof the value shown by Kelley for 
the standard error of an average deviation: 


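$$\sigma_{A.D.} = \frac{0.6\sigma}{\sqrt{N}} = \frac{0.9Q}{\sqrt{N}}$$

$$\text{P.E.}_{A.D.} = \frac{0.4066\sigma}{\sqrt{N}} = \frac{0.6Q}{\sqrt{N}}$$
(Standard and probable errors of an average deviation) (81)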


If we put the standard error of each of the measures of vari- 
ability in terms of the magnitude of the variability measure 
itself, we can get a clear idea as to which measure is most stable 
and therefore most desirable when other considerations are equal. 
In doing this we shall need to know the equivalents between $\sigma$
and each of the other measures of variability. $Q$ equals, of
course, $0.6745\sigma$. On page 81 it is shown that A.D., in a normal
distribution, equals $0.7979\sigma$. The relation between $D$ and $\sigma$
is found as follows. By referring to Table XLIII in the Appendix
we find that when $q = .10$, $x = 1.2816$. The $x$'s are in terms of
$\sigma$'s of a normal distribution of unit area and unit standard deviation.
Thus from the 10th percentile to the mean is $1.2816\sigma$.
From the 10th percentile to the 90th is twice as far. Thus $D$,
the whole range from the 10th to the 90th percentile, is $2.5632\sigma$.
Therefore, replacing $\sigma$'s by their equivalents in other statistics,
we have

$$\sigma_\sigma = \frac{0.7071\sigma}{\sqrt{N}}$$

$$\sigma_{A.D.} = \frac{0.6\sigma}{\sqrt{N}} = \frac{0.6}{0.7979}\cdot\frac{\text{A.D.}}{\sqrt{N}} = \frac{0.752\,\text{A.D.}}{\sqrt{N}}$$

$$\sigma_D = \frac{2.2792\sigma}{\sqrt{N}} = \frac{2.2792}{2.5632}\cdot\frac{D}{\sqrt{N}} = \frac{0.889D}{\sqrt{N}}$$

$$\sigma_Q = \frac{0.7867\sigma}{\sqrt{N}} = \frac{1.166Q}{\sqrt{N}}$$

It is thus shown that the standard deviation is the most reliable
of the customarily employed variability measures, the average 
deviation next, then D and last of all Q. Of course, all these 
computations turn upon the assumption of a normal distribution. 
In practice with small distributions these relations may not 
hold in precisely the same way. 
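The comparison rests on dividing each standard error by the size of the measure itself, both being expressed in terms of $\sigma$. The sketch below retraces this from the rounded coefficients quoted in this chapter; the arrangement of the figures and the names used are our own, and a normal distribution is assumed throughout.

```python
import math

# For each measure: (coefficient c in SE = c * sigma / sqrt(N),
#                    size of the measure itself in sigma units)
measures = {
    "standard deviation": (1.0 / math.sqrt(2.0), 1.0),  # sigma / sqrt(2N)
    "average deviation": (0.6, 0.7979),                 # formula (81)
    "D (10th to 90th percentile)": (2.2792, 2.5632),    # formula (80)
    "Q (quartile deviation)": (0.7867, 0.6745),         # formula (79)
}

# Standard error as a fraction of the measure, apart from 1 / sqrt(N).
relative = {name: c / size for name, (c, size) in measures.items()}

# Most stable measure first.
ranking = sorted(relative, key=relative.get)

for name in ranking:
    print(name, round(relative[name], 3))
```

The smaller the relative standard error, the more stable the measure; the order obtained is that stated above, with the standard deviation first and $Q$ last.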


THE STANDARD ERROR OF A COEFFICIENT OF CORRELATION 


In the lithoprinted edition of this book we gave a derivation 
of the formula for the standard error of r. But the derivation 
was rather lengthy; consequently, limitation of space compels 
us to omit.it from this edition and to refer to the earlier edition 
those readers who are interested in following the proof. The 
formula derived there is the following customary one: 


a 
Or = 1 Ti (Standard error of r) (82) 
P.E., = 0.6745 Lise (The probable error of r) (82a) 


VN 


The derivation involves the following assumptions: recti- 
linearity of regression in both arrays; homoscedasticity in both 
arrays; and mesokurtosis (β₂ = 3) in the sample in both arrays.
These assumptions are rather hazardous, and the distortion is
made far worse by the fact that the distribution of r's from
random samples is highly skew for arithmetically large r's.
It is however satisfactory for large samples and for small or 
moderate sized r’s. 

Soper¹ has shown that, by avoiding certain assumptions and
approximations in the above type of development, a somewhat 
better approximation can be made to the standard error of a 
coefficient of correlation as follows: 


$$\sigma_r = \frac{1 - \rho^2}{\sqrt{N - 1}}\left[1 + \frac{11\rho^2}{4N}\right] \tag{83}$$
(Second approximation to the standard error of r)


where ρ is the true correlation (that from the whole population).
Note that this ρ has nothing to do with rank correlation, for
which it is customary to employ the same symbol. In practice
we would not know ρ and would need either to estimate it by the
rather complicated formula given below or substitute r for it.
We show below that, in samples of reasonable size, r is, on the
average, a close approximation to ρ.

1 Soper, H. E., “On the Probable Error of a Coefficient of Correlation to
a Second Approximation,” Biometrika, Vol. 9, pp. 91-115 (1913).

But what we most often need in practice is not the standard 
error of the r computed from our sample but the standard error 
of r for samples of the same size when the true r is zero. For 
most often we wish to know whether we could reasonably expect 
to have obtained as large r as we did if the true r is zero. The 
formula for the standard error when the true r is zero follows 
readily from the formula quoted above from Soper; it is 


$$\sigma_{r_0} = \frac{1}{\sqrt{N - 1}} \tag{84}$$
(Standard error of r where the true r is zero)


In this case the distribution of random samples is symmetrical 
about zero and may be regarded as normal except for very small 
samples. This formula should be used much more than is now 
the case. It is always the pertinent one when what we wish to 
show is that our obtained r differs reliably from zero. 
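
The use of this formula may be illustrated by a short computation.
The sketch below is in Python and rests on two readings that the
reader should treat as assumptions: formula (82) taken as
(1 − r²)/√N and formula (84) taken as 1/√(N − 1). The function
names and the sample values (an r of .30 from 101 pairs) are
hypothetical and serve only for illustration.

```python
import math


def se_r_classical(r: float, n: int) -> float:
    """Formula (82) as read here: (1 - r^2) / sqrt(N)."""
    return (1.0 - r * r) / math.sqrt(n)


def se_r_null(n: int) -> float:
    """Formula (84) as read here: 1 / sqrt(N - 1), for a true r of zero."""
    return 1.0 / math.sqrt(n - 1)


# Hypothetical illustration: r = .30 obtained from N = 101 pairs.
r_obtained, n_pairs = 0.30, 101
ratio_to_null_se = r_obtained / se_r_null(n_pairs)

print(se_r_classical(r_obtained, n_pairs))  # depends on the obtained r
print(se_r_null(n_pairs))                   # depends on N only
print(ratio_to_null_se)                     # r in units of its null standard error
```

Since the second function involves only N, one value serves for every
r computed on the same number of cases.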

r's from Small Samples.—There is always a slight bias in r
computed from a sample less than the whole population. Fisher¹
gives the relation between an r from a sample and the “most
likely” population value as follows:


$$\hat{\rho} = r - \frac{r(1 - r^2)}{2N} = r\left[1 - \frac{1 - r^2}{2N}\right] \tag{85}$$
(Most likely population equivalent of an r)


This correction would be very small with more than 25 pairs
of observations and wholly negligible with N's of 50 or more.
It always reduces slightly the arithmetic value of r, except when
r = ±1.00 or when r = 0, at which points ρ̂ = r. But the
correction is always far less than the probable error.
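
The size of this correction can be seen from a few trial values. The
sketch below is in Python and assumes that formula (85) reads
ρ̂ = r[1 − (1 − r²)/2N]; the function name and the trial values of r
and N are ours.

```python
def most_likely_rho(r: float, n: int) -> float:
    """Formula (85) as read here: r * [1 - (1 - r^2) / (2N)]."""
    return r * (1.0 - (1.0 - r * r) / (2.0 * n))


# Amount taken off an r of .50 for several hypothetical sample sizes.
for n in (10, 25, 50):
    r = 0.50
    print(n, round(r - most_likely_rho(r, n), 5))

# At r = 0 and r = +1 or -1 the correction vanishes.
print(most_likely_rho(0.0, 10), most_likely_rho(1.0, 10), most_likely_rho(-1.0, 10))
```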

1 Fisher, R. A., “On the Probable Error of a Coefficient of Correlation
Deduced from a Small Sample,” Metron, Vol. 1, No. 4, p. 9 (1921).

It is also true that the distribution of correlation coefficients
in random samples around a true r is not normal. Of course,
that distribution is skew for samples of all sizes as ρ departs from
zero; but in addition the distribution of t (the ratio of r to its
standard error estimated from a sample) is leptokurtic, so that
the use of the normal curve values gives somewhat erroneous
interpretations of the reliability, especially in small samples.
According to Fisher, t obtained by the following expression is
distributed in the same manner as Student’s ratio on the
hypothesis that the true r is zero:


$$t = \frac{r}{\sqrt{1 - r^2}\,/\sqrt{N - 2}} = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \tag{86}$$
(Student’s t for the reliability of r when estimated from the sample)


Fisher does not offer the denominator in the middle expression
above as a formula for the standard error of r; he merely shows
that the best estimate that can be made from a single sample
of the probability of obtaining by chance an r of the size of
the one in hand (or larger) if the true correlation is zero is given
by the t determined by formula (86) when used with Student’s
distribution. The table, which we give on pages 488 to 492, is
to be entered with n = (N − 2), and Table XLVI must be
used to supplement Table XLV if precise probabilities are
wanted and n exceeds 20. But for reasonably large n’s the nor- 
mal curve tables give good enough estimates except for very high 
odds (very low probabilities). On the average, formula (84) will 
give practically the same values as formula (86) except where N 
is as small as 25 or less, and the former is much easier to use since 
it is independent of r and the reliability of a whole column of r’s 
from the same population can be indicated once for all since only 
the N is involved. 
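
The two ways of gauging an r against zero may be set side by side.
The following Python sketch assumes formula (86) in the form
t = r√(N − 2)/√(1 − r²) and formula (84) in the form 1/√(N − 1); the
function names and the sample values are hypothetical.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """Formula (86) as read here; to be referred to Student's table with N - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def ratio_by_formula_84(r: float, n: int) -> float:
    """r divided by 1 / sqrt(N - 1), to be referred to the normal curve."""
    return r * math.sqrt(n - 1)


# Hypothetical illustration: the same r of .20 at several sample sizes.
for n in (10, 25, 100):
    r = 0.20
    print(n, round(t_for_r(r, n), 3), round(ratio_by_formula_84(r, n), 3))
```

The sketch yields only the two ratios; the probabilities must still be
read from Student's distribution in the one case and from the normal
curve in the other.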

Fisher makes much of handling small samples, hence his 
emphasis on such formulas as (86) which differ in outcome 
appreciably from the classical ones only in the case of small 
samples. He characteristically carries the computation of an 
r to four decimal places, even when computed from 10 to 20 
pairs of observations and when the standard error is of the order 
of .20. Then, because he employs formula (86) instead of (82)
for the t, he speaks of his method as “exact.” But the research
worker must not be misled into thinking that any such legerdemain
permits him to calculate coefficients of correlation from 10
or 20 pairs of observations and have meaningful results just
because he employs some trick “correction” formula. No statistic
is more dependable than the observations upon which it is
based, for any correction formula makes the obtained statistic
its starting point. Correlations from 10 pairs of observations
are practically useless regardless of any “correction” formulas,
unless they are from extremely stable variates such as means of
large classes.


TRANSFORMING r INTO z′


In view of the fact that the distribution of r is limited to ±1,
random samples for a given p become highly skew at the two ends. 
Furthermore, it becomes harder and harder to raise an r through 
successive equal units as perfect correlation is approached, so 
that a difference of (say) .10 means far more near the upper or 
lower limit than it means near zero. To remedy this, Fisher has 
proposed that, for certain computational and comparison pur- 
poses, we use instead of r the hyperbolic arctangent of r, which 
he calls z but which, following Tippett, we shall designate z’ 
because it is not exactly the same as the z employed for testing 
significance in analysis of variance. 


$$z' = \tanh^{-1} r = \tfrac{1}{2}\left[\log_e (1 + r) - \log_e (1 - r)\right] = \frac{2.3026}{2}\left[\log_{10} (1 + r) - \log_{10} (1 - r)\right] \tag{87}$$
(Translating r into z′)

z′ can take values from zero to infinity and can take either the
plus or the minus sign. Fisher derives¹ for the distribution of
random samples of z′ the following measures of skewness and of
kurtosis:

$$\beta_1 = \frac{\rho^6}{(N - 1)^3}$$

$$\beta_2 = 3 + \frac{32 - 3\rho^4}{16(N - 1)} + \frac{128 + 112\rho^2 - 57\rho^4 - 9\rho^6}{32(N - 1)^2}$$


Thus β₁, although depending somewhat upon ρ, is nearly equal
to zero so that the distribution is nearly symmetrical; and β₂,
while also somewhat dependent upon ρ, is nearly equal to 3.
Because for a normal distribution β₁ equals 0 and β₂ equals 3,
Fisher claims that the distribution of random samples of z′
is “nearly normal.”

1 Ibid., p. 14.

In his Statistical Methods for Research Workers, Fisher gives
for the standard error of z′


$$\sigma_{z'} = \frac{1}{\sqrt{N - 3}} \tag{88}$$
(Approximate formula for the standard error of z′)


which is independent of z′ and consequently of r. But that is
only an approximation. The more complete formula is¹

$$\sigma_{z'} = \sqrt{\frac{1}{N - 1}\left[1 + \frac{4 - \rho^2}{2(N - 1)} + \frac{176 - 21\rho^2 - 21\rho^4}{48(N - 1)^2}\right]} \tag{88a}$$
(Standard error of z′)

Thus the standard error of z′ is somewhat dependent upon the
z′, since z′ is a function of ρ. E. S. Pearson and his associates
have made several empirical studies of the distribution of z′
and its standard deviation² and find that the full formula agrees
rather closely with the actual distribution and that the
approximate formula does moderately well.

The mean departure of the average z′ from the true z′ is not
zero; the z′ has a slight bias as follows:

$$\bar{z}' - z'_{\text{true}} = \frac{\rho}{2(N - 1)}\left[1 + \frac{5 + \rho^2}{4(N - 1)}\right]$$

In precise work with z′ this requires that a small correction be
made to the obtained z′, which is approximately ρ/[2(N − 1)],
and which must be subtracted arithmetically from the z′ obtained
by formula (87).
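
The transformation and its approximate standard error can be tried
out on a single r. The following is a minimal sketch in Python using
formula (87) and the approximate formula (88); the function names,
the sample values (an r of .60 from 28 pairs), and the use of a band
of 1.96 standard errors are our own choices for illustration and are
not prescribed by the text.

```python
import math


def r_to_z(r: float) -> float:
    """Formula (87): the hyperbolic arctangent of r."""
    return math.atanh(r)


def z_to_r(z: float) -> float:
    """Inverse of formula (87): back from z' to r."""
    return math.tanh(z)


def se_z_approx(n: int) -> float:
    """Formula (88): 1 / sqrt(N - 3)."""
    return 1.0 / math.sqrt(n - 3)


# Hypothetical illustration: r = .60 from N = 28 pairs.
r_obtained, n_pairs = 0.60, 28
z = r_to_z(r_obtained)

# The same value by common logarithms, as in the last member of (87).
z_log10 = 1.1513 * (math.log10(1 + r_obtained) - math.log10(1 - r_obtained))

se = se_z_approx(n_pairs)
low, high = z_to_r(z - 1.96 * se), z_to_r(z + 1.96 * se)
print(round(z, 4), round(z_log10, 4), se, round(low, 3), round(high, 3))
```

The band is symmetrical in z′ but, carried back to r, reaches farther
below the obtained r than above it, which reflects the skewness of
r's around a point other than zero.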

There are some advantages to the use of z’. 

1. The fact that z's in random samples are distributed almost 
normally along the whole range makes the interpretation of the 
standard error more meaningful and legitimate. Regardless of 
the size of the sample, the interpretation may be made in terms
of the normal curve.

2. Unit increments of z′ have nearly the same meaning (in
terms of difficulty of attaining them) all along their range, while
r's do not. This fact makes adding, subtracting, or averaging
z′'s more legitimate processes than are like processes with r's.

1 Ibid., pp. 13-14.

2 Pearson, Egon S., “Further Experiments on the Sampling Distribution
of the Correlation Coefficient,” J. Amer. Statistical Assoc., Vol. 27,
pp. 121-128; see also Biometrika, Vol. 21, pp. 257f.

3. If one is insistent upon showing how sample r's would
spread at the level of his obtained r rather than around a true r
of zero, the only really correct way of showing it is by translating
his r to z′ and interpreting the z′ in terms of the σ of z′. For the
distribution of r's around any other point than zero is neither
normal nor symmetrical.

On the other hand z′ has limitations of which cognizance must
be taken.

1. Its reliability formulas (when in usable form) are approxi- 
mations, just as are those of r. 

2. z' is only an intermediate statistic; the final result must 
be in terms of r. For z’ has only an artificial meaning while r 
has a straightforward and practical meaning: viz., the slope of 
the best-fitting line when the variabilities have been equalized. 

3. The advantage from the standpoint of standard error is 
academic rather than practical. The main situation in which 
we are concerned about the standard error is when we wish to 
know whether the r we have in hand might have arisen by chance 
when the true r is zero. But samples of r when p is zero are 
distributed as symmetrically as z’ is, and nearly as normally, 
and the formula for the standard error is also independent of r 
[formula (84)]. Since r at this point has all the advantages of z’, 
the awkwardness of the transformation may be avoided. 

4. It is legitimate to add or to subtract z′'s only when we can
assume that they are estimates from the same population. It
would not do, for example, to average z′'s for the correlation
between intelligence tests and academic achievement when the
intelligence was measured by different tests and on somewhat
different types of students. When needing a central tendency
for a number of r's it would be much better to take the median,
and the median z′ would correspond exactly to the median r, so
that nothing whatever would be gained by the transmutation.

5. If we add or subtract z′'s in correlated samples and test
the significance of the sum or the difference, we shall lose accuracy
by reason of not knowing the tail of the formula involving
correlation (see pages 160 to 162). For the r between z′'s is not
known. The loss from this cause might well be greater than any
gain from using z′ instead of r.

ADDITIONAL RELIABILITY FORMULAS


For the standard error of β₁ or β₂, see H. L. Rietz, Handbook of
Mathematical Statistics, page 96, or Karl Pearson, Tables for 
Statisticians and Biometricians, Tables 37 and 38. For many 
additional standard-error formulas, see Kurtz and Dunlap, 
Handbook of Statistical Nomographs, Tables, and Formulas, pages 
103 to 140. 


Exercises 


1. Find the P.E. of the mean in Table I, page 48, and in Table II, page 
45. Interpret these statistics. 

2. Find the standard error of the medians for these same two distributions, 
and compare them with the standard errors of the means. 

3. Find the value of the 10th percentile in Table I, and compute its 
standard error. 

4. The norm for the seventh grade in a certain arithmetic test is 48. If a
typical sample of 36 of your pupils makes a mean of 45 and a standard
deviation of 12, what are the odds that the true mean for your school is up
to norm?

5. What are the odds that the true mean for this particular sample is up
to norm, if the reliability of the test is 0.89?

6. From how large a population must an r of .20 have been computed if it is 
to be as much as three times its standard error? 

7. Develop a formula for the standard error of V (the coefficient of
variation). Suggestion: take logarithmic derivatives, remember that the
correlation between means and standard deviations is zero, and free your
final formula from all terms except V and N. Compare your formula with the
accepted one (which you will find in Holzinger’s text and in several others).

CHAPTER VI
THE RELIABILITY OF DIFFERENCES 


The reliability of differences is an even more important matter 
than the reliability of statistics of separate groups. For cus- 
tomarily we wish to make comparisons and then we need to 
know, when we find differences by such comparisons, whether 
they can be explained on the basis of chance fluctuations alone 
or whether they indicate true differences. 


THE STANDARD ERROR OF THE DIFFERENCES BETWEEN MEANS 


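We may regard our means as deviations from the means of
all the samples of their respective series, and this will simplify
the algebra. If S is the number of samples we get, by the
ordinary definition of standard deviation when the items are in
deviation form,

$$\sigma^2_{m_x - m_y} = \frac{\sum (m_x - m_y)^2}{S} = \frac{\sum m_x^2}{S} + \frac{\sum m_y^2}{S} - \frac{2\sum m_x m_y}{S}$$

The first term is $\sigma^2_{m_x}$, and the second term $\sigma^2_{m_y}$. In the third term
we shall multiply both the numerator and denominator by $\sigma_{m_x}\sigma_{m_y}$
and have

$$\sigma^2_{m_x - m_y} = \sigma^2_{m_x} + \sigma^2_{m_y} - \frac{2\sum m_x m_y}{S\,\sigma_{m_x}\sigma_{m_y}}\,\sigma_{m_x}\sigma_{m_y}$$

As a part of the last term we now have the r between the means,
so that we may rewrite the expression as follows:

$$\text{(A)}\qquad \sigma^2_{m_x - m_y} = \sigma^2_{m_x} + \sigma^2_{m_y} - 2r_{m_x m_y}\sigma_{m_x}\sigma_{m_y}$$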


We have next to seek a simpler value for the r between the means,
which appears as a disturbing factor in formula (A). If we let
x₁, x₂, x₃, . . . , xₙ represent the successive scores within a
sample of the x series and a similar arrangement represent the
scores in the corresponding y series and we conceive these scores
as deviations from the means of the whole aggregate of samples
in their respective series (as we must if we are to be consistent here
with the conception of the means as deviations employed above),
we shall have

$$r_{m_x m_y} = \frac{\sum \dfrac{(x_1 + x_2 + x_3 + \cdots + x_n)}{n}\,\dfrac{(y_1 + y_2 + y_3 + \cdots + y_n)}{n}}{S\,\sigma_{m_x}\sigma_{m_y}} = \frac{\sum (x_1 + x_2 + x_3 + \cdots + x_n)(y_1 + y_2 + y_3 + \cdots + y_n)}{n^2 S\,\sigma_{m_x}\sigma_{m_y}}$$

When we multiply together the two factors in the numerator, we 
shall get two types of products, those involving the paired items 
and those involving cross products between nonpaired items, as 
follows: 


$$r_{m_x m_y} = \frac{\sum (x_1y_1 + x_2y_2 + x_3y_3 + \cdots + x_ny_n) + \sum (x_1y_2 + x_1y_3 + \cdots + x_2y_1 + \cdots + x_ny_{n-1})}{n^2 S\,\sigma_{m_x}\sigma_{m_y}}$$


It would not be far from correct to say, as is customarily done, 
that the second type of products sum to zero since the items that 
are multiplied together are uncorrelated. But that is not strictly 
true. We have a situation precisely like the one encountered in 
our preceding chapter in connection with the standard error of a 
mean (page 131). If carried through a process similar to the one 
employed there, our development would arrive at the following: 


$$\sigma_{m_x}\sigma_{m_y}\,r_{m_x m_y} = \frac{\sum xy}{nN}\left(1 - \frac{n - 1}{N - 1}\right)$$

Dividing through by the coefficient of r, and then substituting the
value we found for the standard error of a mean,

$$r_{m_x m_y} = \frac{\dfrac{\sum xy}{nN}\left(1 - \dfrac{n - 1}{N - 1}\right)}{\dfrac{\tilde{\sigma}_x}{\sqrt{n}}\sqrt{1 - \dfrac{n - 1}{N - 1}}\;\cdot\;\dfrac{\tilde{\sigma}_y}{\sqrt{n}}\sqrt{1 - \dfrac{n - 1}{N - 1}}}$$

Since the n is the same for each sample and the N constant
throughout our problem, certain factors containing these terms
will cancel, and we are left with

$$r_{m_x m_y} = \frac{\sum xy}{N\tilde{\sigma}_x\tilde{\sigma}_y}$$

Since N is the total population, Σxy the sum of the products of
paired items for this total population, and $\tilde{\sigma}_x$ and $\tilde{\sigma}_y$, respectively, the
standard deviations of the two total populations, we have

$$r_{m_x m_y} = \rho_{xy} \tag{89}$$
(Coefficient of correlation between means)

Thus it is proved that, for all samples combined, the r between
means of successive constituent samples equals the correlation
of the paired scores for the whole population. We do not know
the ρ for the large population N, but we may take our $r_{xy}$ from
the sample in hand to represent it for our purpose. Substituting
$r_{xy}$ for $r_{m_x m_y}$ in formula (A), page 160, and extracting the square
root of both sides of the equation,

$$\sigma_{m_x - m_y} = \sqrt{\sigma^2_{m_x} + \sigma^2_{m_y} - 2r_{xy}\sigma_{m_x}\sigma_{m_y}} \tag{90}$$
(Standard error of the difference between two means)

This is the formula that should always be used when calculating 
the standard error of the difference between means of groups 
matched on some criterion so as to involve the presence of an 
element of correlation between the groups compared. Unfortu- 
nately the tail of this formula, containing the r, is often omitted, 
in consequence of which the standard errors as calculated are too 
high. Of course, if the two series should be uncorrelated, the r 
would be zero, and the formula would become, since the third 
term would amount to zero, 


$$\sigma_{m_x - m_y} = \sqrt{\sigma^2_{m_x} + \sigma^2_{m_y}} \tag{91}$$
(Standard error of the difference between two means in case of
uncorrelated series)
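
The effect of omitting the tail of formula (90) may be seen in a short
computation. The following is a minimal sketch in Python; the function
and parameter names are ours, and the numerical values are hypothetical.

```python
import math


def se_diff_means(se_mx: float, se_my: float, r_xy: float = 0.0) -> float:
    """Formula (90); with r_xy = 0 it reduces to formula (91)."""
    return math.sqrt(se_mx ** 2 + se_my ** 2 - 2.0 * r_xy * se_mx * se_my)


# Hypothetical illustration: each mean has a standard error of 1.5,
# and the matched series correlate .60.
with_tail = se_diff_means(1.5, 1.5, r_xy=0.60)
without_tail = se_diff_means(1.5, 1.5)

print(round(with_tail, 4), round(without_tail, 4))
```

With a positive r the value found without the tail is the larger of the
two, which is the overstatement spoken of above.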


In either of these formulas the value of $\sigma^2_m$ as developed on
pages 133 to 135 of this volume is to be substituted. It was
shown there that, if the successive samples are to be chosen at
random from a large population, $\sigma^2_m = \tilde{\sigma}^2/N$; if the successive
samples are matched with one another on a true criterion so that
they are correlated with one another, then

$$\sigma^2_m = \frac{\tilde{\sigma}^2}{N}(1 - r),$$

where the r is the reliability coefficient of the measuring
instrument; if the successive samples are matched with one another
on a fallible criterion so that they are correlated with one another
by reason of being correlated with the criterion, then

$$\sigma^2_m = \frac{\tilde{\sigma}^2}{N}(1 - r^2),$$

where the r is the coefficient of correlation between the matching
criterion and one of the samples. Using first the simplest of
these cases, we have, by substituting in formula (90),

$$\sigma_{m_x - m_y} = \sqrt{\frac{\tilde{\sigma}_x^2 + \tilde{\sigma}_y^2 - 2r_{xy}\tilde{\sigma}_x\tilde{\sigma}_y}{N}} \tag{92}$$
(Standard error of the difference between means of two series
correlated with each other but successive samples in the same
series random ones)

Note that the N belongs under the radical sign.¹

This is the form that is likely to be most needed in practice. 
For when, say in an experiment, we wish to compare the mean 
success of our two groups, we ordinarily wish to know what could 
be expected to happen if we drew at random other groups, subject 
only to the condition that they be matched with each other, and 
put them through the same experiment. Nevertheless we might 
have involved the other types of situations, and we shall include 
them here for the sake of making the issue thoroughly clear. 
Suppose, for example, we measured the difference in attainment 
in arithmetic between a seventh-grade group of pupils under one 
teacher and a seventh-grade group under another teacher and 
meant by our statement of the reliability of the obtained differ- 
ence what could be expected to happen if we took many measures 
of the same kind of these same two groups. Then the formula 
next to be stated would be appropriate. Substituting for $\sigma^2_m$
in the case of successive groups matched on a true criterion
(because in each series the groups are the same in the successive
samples), we get²

$$\sigma_{m_x - m_y} = \sqrt{\frac{\tilde{\sigma}_x^2(1 - r_{xx})}{N} + \frac{\tilde{\sigma}_y^2(1 - r_{yy})}{N}} \tag{93}$$
(Standard error of a difference between means when successive
samples are matched on a true criterion)

1 This form holds only if the N is the same in both the series. Otherwise
we must write

$\sigma_{m_x - m_y} =$

2 The $r_{m_x m_y}$ would be zero between two series of means which remain on a
constant level except for random fluctuations in successive samples.



If the samples are not the same pupils repeating the experiment 
but instead they are pupils of the same ability as those upon 
whom the first experiment was done (if, that is, they are always 
matched with the original groups in their own series on a fallible 
criterion, so that the successive groups would always have the 
same educational age or the same general intelligence or the 
same socioeconomic status as those of the earlier experiment), 
then the formula would become¹

$$\sigma_{m_x - m_y} = \sqrt{\frac{(\tilde{\sigma}_x^2 + \tilde{\sigma}_y^2)(1 - r^2)}{N}} \tag{94}$$
(Standard error of a difference between means when successive
samples are matched with the initial one on a “fallible” criterion)


1 This is the formula given by Lindquist, although he does not indicate
the limitation under which it must be used. See J. Educ. Psychol., Vol. 22,
pp. 197-204 (1931).

Where the groups between which differences of means are
taken have been matched by matching individuals, which is the
case in well set up experiments (see page 448), the standard
error of the difference between the means can be put into a form
far simpler for computational purposes than the above. For
here we have paired scores so that the differences may be taken
between these paired scores, and operation with these d’s makes
unnecessary the computation of a coefficient of correlation.
Recall the formula for an r, which we developed on page 101,

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2\sigma_x\sigma_y}$$

We shall substitute this value for r in formula (92), cancel
terms that permit canceling, and find an extremely simple
formula resulting.

$$\sigma_{m_x - m_y} = \sqrt{\frac{\sigma_x^2 + \sigma_y^2 - 2\sigma_x\sigma_y\,\dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2\sigma_x\sigma_y}}{N - 1}} = \sqrt{\frac{\sigma_d^2}{N - 1}}$$

$$\sigma_{m_x - m_y} = \frac{\sigma_d}{\sqrt{N - 1}} \tag{95}$$
(Standard error of the difference between means in terms of the
differences between paired scores)

The (N − 1) instead of N is important in small samples, and in
very small samples t should be used with Student’s distribution
instead of the normal distribution.

Thus, in the case we shall ordinarily be dealing with (successive 
samples chosen at random but the two series matched in each 
sample), the standard error of the difference between the means 
of paired groups turns out to be merely the standard deviation of 
the differences between paired scores divided by the square root 
of the number of such paired scores. Although this formula 
relieves the worker of the necessity of computing any r between 
the series, it takes full account of the value of the r. In a later
chapter on the technique of controlled experimentation we shall
show that there are many additional advantages accruing from
this arrangement beyond the one of ease of operation. If the
two series are not correlated, the r that equals zero will
automatically take care of itself just as well as any other r. We
shall show an example of this method of computing the standard
error of the difference between means by utilizing a table from a
controlled experiment on the effect upon achievement in geometry
from requiring failing pupils to remain after school hours.¹
In the first column are shown the “standard scores” (z scores) 
of both members who made a pair indicating their prospective 
capacity to learn, the “free” group on the left, and the “kept” 
on the right. The other columns give in succession the score 
on the test earned through the semester by the “free” pupils, 
the score by the “kept” pupils, and the differences. It is this 
last column from which we compute our standard error. 

The difference between the means of the two groups is
(15.8 − 15.4 = 0.4). The standard deviation of the column
of differences is found, by calculation, to be 3.28. This divided
by the square root of 22, which is the number of cases minus 1
(to take account of the fact that the numerator is σ instead of σ̃),
will give the standard error of the difference between the means.
3.28 divided by √22 gives 0.7. Thus while the difference is
0.4, its standard error is 0.7, so that the difference is only 0.6
of its standard error. From this comparison we conclude,
therefore, that, while there is a slight difference found against
keeping after school, it is a difference of practically no statistical
significance.

1 From a master’s thesis at Pennsylvania State College by Bertha A.
Swartz.


TABLE XI.—ILLUSTRATING THE COMPUTATION OF THE STANDARD ERROR OF
A DIFFERENCE

  Matching scores      Attainment scores
   Free     Kept        Free     Kept      Differences
   1.63     1.64         22       16            6
   1.24     1.19         21       21            0
   0.98     0.94         18       15            3
   0.74     0.68         16       16            0
   0.55     0.58         16       20           -4
   0.54     0.49         17       19           -2
   0.37     0.37         18       13            5
   0.12     0.13         15       16           -1
   0.30     0.45         17       17            0
   0.09     0.13         16       16            0
   0.24    -0.04         14       13            1
  -0.24    -0.20         14       18           -4
  -0.25    -0.25         14       17           -3
  -0.33    -0.33         17       11            6
  -0.37    -0.39         16       15            1
  -0.40    -0.59         18       13            5
  -0.54    -0.37         14       14            0
  -0.33    -0.29          9       16           -7
  -0.13    -0.13         16       14            2
  -0.81    -0.90         10       13           -3
  -0.91    -0.91         14       14            0
  -1.24    -1.40         15       12            3
  -1.92    -1.69         16       15            1
  Means                  15.8     15.4          0.4

It is suggested that the reader work out the standard error 
of the difference between the means for this problem by the long 
formula (92) and prove to himself that the two formulas give 
identical results and that the short method is far more economical. 
The short method necessitates pairing individuals but so does 
the calculation of the r for the long method. However, in some
problems involving very large numbers of cases, or in cases where
the n is different in the two populations, it may be enough to
determine our r from a sample of the whole population, in which
case formula (90) may be more economical.
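
The comparison of the two methods suggested above can be sketched
in Python with the attainment scores of Table XI. Two points are
assumptions on our part: the standard deviations are computed with N
in the denominator, and the long formula is written with N − 1 under
the radical so as to match formula (95). The variable names are ours.

```python
import math
from statistics import mean, pstdev

# Attainment scores from Table XI, in the order of the pairs.
free = [22, 21, 18, 16, 16, 17, 18, 15, 17, 16, 14, 14,
        14, 17, 16, 18, 14, 9, 16, 10, 14, 15, 16]
kept = [16, 21, 15, 16, 20, 19, 13, 16, 17, 16, 13, 18,
        17, 11, 15, 13, 14, 16, 14, 13, 14, 12, 15]
n = len(free)
diffs = [f - k for f, k in zip(free, kept)]

# Short method, formula (95): sigma of the differences over sqrt(N - 1).
se_short = pstdev(diffs) / math.sqrt(n - 1)

# Long method: sigmas of both series and the r between them.
sx, sy = pstdev(free), pstdev(kept)
mx, my = mean(free), mean(kept)
r_xy = sum((f - mx) * (k - my) for f, k in zip(free, kept)) / (n * sx * sy)
se_long = math.sqrt((sx ** 2 + sy ** 2 - 2.0 * r_xy * sx * sy) / (n - 1))

print(round(mean(diffs), 2), round(pstdev(diffs), 2))
print(round(se_short, 3), round(se_long, 3))
```

The two standard errors agree, as the algebra requires, and the short
method reaches the result without any r being computed.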

Standard Error of the Difference between Mean Gains.—Very 
frequently we have a situation in which we measure gains made 
by groups through a period of time and wish to determine the 
statistical significance of the difference between the mean gains 
of two groups. Thus we may be interested in comparing the 
mean gain in speed of reading made in a semester by a group of 
pupils of given average mental age with the mean gain made by 
a control group that has had no such drills but which control 
group has been matched with the drill group on some such index 
of learning ability as IQ’s. This is a more complex problem than 
the one where we merely compare one mean with another. 
Letting x represent the scores of the one group (in deviation 
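form) and y those of the other group, 1 indicating the scores at
the beginning of the period and 2 those at the end, the following
formula would state our case:

$$\sigma^2_{(m_{x_2} - m_{x_1}) - (m_{y_2} - m_{y_1})} = \frac{\sum (m_{x_2} - m_{x_1} - m_{y_2} + m_{y_1})^2}{S}$$

If the reader will square the polynomial of the four terms and
carry through a process of substitutions parallel to the one we
did above in connection with the standard error of a mean, he
will arrive at the following formula:

$$\sigma_{(m_{x_2} - m_{x_1}) - (m_{y_2} - m_{y_1})} = \frac{1}{\sqrt{N}}\Big(\tilde{\sigma}_{x_1}^2 + \tilde{\sigma}_{x_2}^2 + \tilde{\sigma}_{y_1}^2 + \tilde{\sigma}_{y_2}^2 - 2r_{x_1x_2}\tilde{\sigma}_{x_1}\tilde{\sigma}_{x_2} - 2r_{y_1y_2}\tilde{\sigma}_{y_1}\tilde{\sigma}_{y_2} + 2r_{x_1y_2}\tilde{\sigma}_{x_1}\tilde{\sigma}_{y_2} - 2r_{x_1y_1}\tilde{\sigma}_{x_1}\tilde{\sigma}_{y_1} + 2r_{x_2y_1}\tilde{\sigma}_{x_2}\tilde{\sigma}_{y_1} - 2r_{x_2y_2}\tilde{\sigma}_{x_2}\tilde{\sigma}_{y_2}\Big)^{1/2} \tag{96}$$
(Standard error of the difference between mean gains by
correlated groups)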


This is the correct formula which must be used for exactness
when employing the conventional method. Lindquist gives
this formula with the last four terms of the tail omitted.¹ It
is true that, since two of them are positive and two negative,
they would largely cancel one another; but not completely so,
unless the two groups are independent.

1 Lindquist, E. F., “On the Determination of Reliability in Comparing
the Final Mean-scores of Matched Groups,” J. Educ. Psychol., Vol. 20, p. 105.

We shall show that an immensely simpler formula is identical in
value with this long one. Using the same symbolism as above,¹


$$(x_2 - x_1) - (y_2 - y_1) = (g_x - g_y) = d,$$


the difference in gain in the case of one particular individual. 
Summing for all individuals in the group and dividing by N, 


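$$\left(\frac{\sum x_2}{N} - \frac{\sum x_1}{N}\right) - \left(\frac{\sum y_2}{N} - \frac{\sum y_1}{N}\right) = \left(\frac{\sum g_x}{N} - \frac{\sum g_y}{N}\right) = \frac{\sum d}{N}$$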


Therefore, by reason of the meaning of a mean, 


$$(m_{x_2} - m_{x_1}) - (m_{y_2} - m_{y_1}) = (m_{g_x} - m_{g_y}) = m_d$$


We may now express, for a series of samples, the standard 
deviation of each of the quantities between which equality is 
indicated in the above equation. Since the items that constitute 
the sigmas for the three series involved in the above equation 
are severally equal, the sigmas obtained from them must be equal. 
Therefore, 


$$\sigma_{(m_{x_2} - m_{x_1}) - (m_{y_2} - m_{y_1})} = \sigma_{(m_{g_x} - m_{g_y})} = \sigma_{m_d}$$


But the standard error of any mean, including, of course, the
standard error of the mean of the d’s involved in the last
expression at the right in the equation last above, equals the
standard deviation of the distribution divided by the square root
of the number of items. Therefore, $\sigma_{m_d}$ equals $\tilde{\sigma}_d/\sqrt{N}$, where d
stands for the differences between the paired gains between the
x and the y series of the sample in hand. Thus we obtain a very
simple formula for the standard error of the difference between
mean gains in correlated series, parallel to the one for means,
as follows:

$$\sigma_{(m_{g_x} - m_{g_y})} = \frac{\tilde{\sigma}_d}{\sqrt{N}} \tag{97}$$
(Short formula for the standard error of the difference between
mean gains in correlated series)
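
That the short formula carries the whole force of the long one can be
verified on a small set of figures. The Python sketch below uses
invented scores for five matched pairs (they come from no experiment)
and writes each r·σ·σ term of the long formula as a covariance, which
is the same quantity; the helper names are ours.

```python
import math
from statistics import mean, pstdev

# Hypothetical initial (1) and final (2) scores for five matched pairs.
x1 = [10, 12, 9, 14, 11]
x2 = [15, 16, 12, 18, 17]
y1 = [11, 12, 10, 13, 11]
y2 = [13, 15, 12, 17, 12]
n = len(x1)

# Short formula (97): sigma of the paired differences in gain over sqrt(N).
d = [(b - a) - (q - p) for a, b, p, q in zip(x1, x2, y1, y2)]
se_short = pstdev(d) / math.sqrt(n)


def cov(u, v):
    """Covariance with N in the denominator; equals r * sigma_u * sigma_v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Long formula (96): four variances and the six terms of the tail.
total = (cov(x1, x1) + cov(x2, x2) + cov(y1, y1) + cov(y2, y2)
         - 2 * cov(x1, x2) - 2 * cov(y1, y2)
         + 2 * cov(x1, y2) - 2 * cov(x1, y1)
         + 2 * cov(x2, y1) - 2 * cov(x2, y2))
se_long = math.sqrt(total / n)

print(d, round(se_short, 4), round(se_long, 4))
```

In small samples the worker may prefer N − 1 in place of N in both
computations; the equality of the two methods is not affected.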


1 A derivation along the same lines as that given on p. 164 is also available
here; but the proof is considerably more lengthy and the simple derivation
given here seems sufficient.

Sometimes the statistical significance of the sum of means is
wanted instead of the difference. Sometimes, too, what is
wanted is the standard error of the sum of gains instead of their
difference, especially in rotation experiments. Whatever the
combination of means involved, the worker can make his own
formula by constructing a polynomial from the combination of
means designated and setting up a formula parallel in structure
to our long ones (90) and (96), carefully watching the signs.
But in every case of matched groups, no matter how complicated
the combination of means required, a corresponding short formula
may be employed, which will be algebraically identical with the
long one and will involve the full force of each r, by merely
performing for each combination of paired variates the algebraic
additions called for among the means, taking the standard
deviation of these sums, and dividing this standard deviation
by the square root of N. Thus

$$\sigma_{m_s} = \frac{\tilde{\sigma}_s}{\sqrt{N}} \tag{98}$$
(The standard error of any combination of means)

The operation of these formulas will be further illustrated in
connection with a later chapter on controlled experimentation.

Interpretation of the Standard Error of the Difference between 
Means.—Perhaps it may be well to pause here again to consider 
the interpretation of the standard error of a difference between 
means. The interpretation is essentially similar to that of the 
standard error of a mean except that we are interested almost 
exclusively in the relation of our difference to a hypothetical 
difference of zero. We shall take as our concrete example the 
one shown in the table from Miss A ‘ 
Swartz’s study of keeping after D 
school, page 166. Here the differ- 
ence was 0.4 and the standard error 
of the difference 0.7, showing a 
slight advantage for not keeping c 
after school. We are interested in 0 0.4 
knowing what the chances are that, EE 
with further sampling, the advantage may not descend to 
zero and pass to the other side. We are interested, then, 
in the hypothesis that the true mean may lie as low as zero. 
We shall construct a normal distribution of assumed differences
with the mean at zero. If the true mean of all the
differences were at zero and the standard error of that mean
were 0.7, some differences would go as high as the 0.4 we
obtained from our sample, viz., all those above CD. In the
trapezoid ABCD with a base of 0.4/0.7 = 0.6 (in round numbers) there lie 22.5 per
cent of all the cases in the distribution. Thus above the point C,
at which our obtained difference lies, would be 50 − 22.5 =
27.5 per cent of the cases, while below that point would lie the
50 per cent in the lower half of the distribution plus the 22.5 in 
the trapezoid or 72.5 per cent of the cases. So out of every 100 
samples 27.5 would be expected to give differences of 0.4 or 
higher even though the true difference were zero, while the other 
72.5 would give differences of less than that. If, however, the 
true difference is as low as zero, something has happened to us 
in this experiment that would happen only 27.5 times out 
of 100. The chances are 72.5 to 27.5, or 2.6 to 1, against such 
a coincidence. These chances are something, but they fall far 
below giving us practically complete assurance that we have 
not gotten the advantage on the side we did merely by reason 
of chance fluctuation; hence we say that the difference has 
negligible statistical significance. 
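The reading of the normal table in the paragraph above can be reproduced with a few lines of code. This is a minimal sketch: the function name `upper_tail` is our own, and the ratio is rounded to 0.6 as in the text before the area is taken.

```python
import math

def upper_tail(z):
    """Proportion of a unit normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

difference = 0.4       # obtained difference (from the text)
standard_error = 0.7   # its standard error (from the text)

# The text works with the ratio in round numbers, 0.6.
ratio = round(difference / standard_error, 1)

p_above = upper_tail(ratio)        # chance of a difference this large or larger
p_below = 1.0 - p_above            # chance of a smaller difference
odds_against = p_below / p_above   # odds against so large a chance difference

print(f"ratio = {ratio}, upper tail = {p_above:.3f}, odds = {odds_against:.1f} to 1")
```

The tail area comes out close to the 27.5 per cent read from the table, and the odds close to 2.6 to 1.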

This is the type of interpretation that we shall practically 
always want to place upon the standard error of a difference. 
A number of writers of elementary textbooks on statistics give 
for practice elaborate problems about the chances that the true 
difference is not less than a certain amount or more than a certain 
amount or between certain specified amounts when the obtained 
difference and the standard error are specified amounts. But 
such problems are rather artificial; the authors of this book 
have never yet encountered a practical research problem in 
which such interpretations were needed. Besides, it will be 
found that the solutions intended by these writers for their 
artificial problems turn upon placing the obtained difference at 
the middle of the distribution—a wholly unwarranted procedure 
—which fundamental error makes fairly simple the statement of 
a solution that, while not impossible, is difficult and awkward 
when correctly put. 

What we have said about the interpretation of the reliability 
of a difference between means will hold true for all the other 
differences we shall discuss throughout the remainder of this 
chapter, subject to what is said later about small samples.

STUDENT’S DISTRIBUTION FOR SMALL SAMPLES


Since the N in the example we have been using here is rather 
small (23 cases), it will constitute a good one for showing the 
application of Student’s distribution for small samples and for 
comparing the interpretation by the small-sample technique with 
that for large samples. The assumption we made above that a
large number of means (or differences, or other statistics) could be
expected to group themselves in a normal distribution about the
true value holds approximately for small samples as well as for
large ones. If we could divide the deviation of the mean (or
other statistic) by the true standard deviation of the whole
population of such statistic, $\tilde{\sigma}$, to get t, the t’s would also be normally
distributed. However, we do not know this population standard
deviation but must use instead an estimate of it, s. When t’s
are obtained by dividing the deviation of a statistic by s instead
of $\tilde{\sigma}$, their distribution is no longer normal. In 1908 an English
scholar who modestly signed his name Student worked out
mathematically the distribution¹ of t (which he called z) when
thus obtained by dividing the deviation of a mean from the
hypothetical true value by the standard deviation of the sample
instead of by $\tilde{\sigma}$. He dealt with the distribution of means, including
in particular the mean of a set of paired differences like that
of our illustrative exercise. But it is now known that the same
distribution holds for other statistics as well. The distribution
is symmetrical about the true value, just as the normal distribution
is; but it is more leptokurtic than the normal distribution
and is different for each n. As n increases, the distribution
approaches normality. The reason is that s, the estimate of the
population variability, becomes a better and better estimate of $\tilde{\sigma}$
as n increases in size and s approaches $\tilde{\sigma}$ as n approaches infinity.

Student found the standard deviation of his distribution to
be $1/\sqrt{N-3}$. His $\beta_2$ differs somewhat from the 3 of the normal
distribution but rapidly approaches 3 as N increases.² In his
original article he carried his table only to N = 10, showing
that beyond that point a good approximation is reached by
dividing the σ of the sample by $\sqrt{N-3}$ instead of by $\sqrt{N-1}$
and using the normal curve tables. But later (1917) he extended
his table to N = 30, and still later (1925) Fisher, with Student’s
blessing, redeveloped Student’s integral in terms of (N − 1),
and Student made new tables. It is these 1925 tables from which
our tables in the Appendix (Tables XLV and XLVI) are taken,
although we table the tail of the distribution from t to plus
infinity while Student tables the area under the curve from minus
infinity to t, so that our values are his subtracted from one.
Our table gives directly the probability of obtaining, on the basis
of chance alone, a t as large as the one in hand deviating in the
same direction from the hypothetical value as the one in our sample.

1 Student, “The Probable Error of a Mean,” Biometrika, Vol. 6, pp. 1-25
(1908).
2 For a more detailed, and yet simple, discussion of the small-sample
technique we suggest L. H. C. Tippett, The Methods of Statistics, Williams
and Norgate, 1937, Chap. 5.

Student carried his 1925 table only to n = 20 (n’ = N = 21).
His reason was that beyond that number the shape of his distribution
was so nearly normal that the normal curve tables give good
enough results, provided one divides the standard deviation of the
sample by $\sqrt{N-3}$ instead of by $\sqrt{N-1}$. But Student provided
a supplementary table for dealing with n’s above 20, if
precise probabilities are wanted. We reproduce this as Table
XLVI. It involves interpolating from an n of infinity as follows:

$$p = p_\infty + \frac{c_1}{n} + \frac{c_2}{n^2} + \frac{c_3}{n^3} + \frac{c_4}{n^4}$$

where p is the desired probability, $p_\infty$ is the value given in the
last column of Table XLV for n = infinity, and the c’s (each taken with its tabled sign) are
given in the body of Table XLVI. Although Student’s tables, 
which were published in Metron (Rome, Italy) in 1925, have been 
little used and little known by American research workers, they 
contain the bases for the most precise interpretation of probabili- 
ties anywhere available for the type of application for which they 
were intended. (Student reports that the corrections provided
in his table of c values, our Table XLVI, give approximations
to the order of 0.000005.) Fisher’s table of the distribution of
Student’s t, which is the one best known in contemporary
practice, is intended for hurried use in making rough interpretations.
Both have a place among research tools. We shall show
how to use both tables, employing the data from Miss Swartz’s
experiment for the purpose. We shall compare the values from
both of these tables with the values from the normal distribution.
Let us consider first Student’s table.

The N in Miss Swartz’s experiment was 23 (n = 22), and the 
t was given (page 166) as 0.6 in round numbers but more precisely 
should be 0.572. In the normal curve for a t of 0.572 we have in
the upper tail of the distribution (Table XLIV) 0.2836, which
means that we could expect to get as large a difference as we did
in favor of the “free” class about 28 per cent of the times (28
times in 100) even if the true difference was zero. We now
compare that with the probability indicated by Student’s
distribution. Going to Student’s table (page 489), we do not
find a column for n = 22. So we must interpolate as stated
above, which we do from the last column in Table XLV aided
by Table XLVI. Since we do not have a row for t = 0.572, we
must find values for t = 0.5 and then for t = 0.6 and interpolate.
For t = 0.5,

$$p = 0.3085375 + \frac{0.0550102}{22} - \frac{0.008509}{22^2} - \frac{0.00697}{22^3} + \frac{0.0022}{22^4} = 0.3110$$


For t = 0.6, by the corresponding formula, p = 0.2773. By 
linear interpolation between these for t = 0.572, p = 0.2852,
to four decimal places. More exact work would require more 
refined methods of interpolation than the linear one. But 
practical purposes in research would probably never require 
such refinement. 
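The correction for t = 0.5 can be carried through mechanically. The sketch below is ours, not Student’s: the function name `student_tail` is hypothetical, the value for n = infinity and the four c’s are simply the figures quoted in the worked example above, and the signs are entered as they appear there.

```python
def student_tail(p_inf, c, n):
    """Tail probability for n degrees of freedom from the n = infinity value.

    c holds c1..c4 with their signs; term k is c[k - 1] / n ** k.
    """
    return p_inf + sum(ck / n ** k for k, ck in enumerate(c, start=1))

# Figures for t = 0.5 as they appear in the worked example of the text.
p_inf = 0.3085375
c_values = [0.0550102, -0.008509, -0.00697, 0.0022]

p_t05 = student_tail(p_inf, c_values, n=22)
print(f"p for t = 0.5, n = 22: {p_t05:.4f}")
```

Only the first two corrections affect the fourth decimal place here; the terms in the third and fourth powers of n matter when n is small or when more decimals are wanted.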

In this problem the discrepancy between the results by 
using Student’s distribution and those by using the normal curve 
is wholly negligible; the probability is 0.2852 by the one method 
and 0.2836 by the other. But the discrepancy would be much 
greater farther out in the tail of the distribution. For example, 
if the n is 40 and t = 4.0, the odds by the normal curve table are
32,000 to 1 while those indicated by Student’s table are only 
7,500 to 1, which is a tremendous discrepancy. If the odds are 
very high and it seems worth while to state them, they should be 
determined from Student’s table rather than from a table of the 
normal distribution integral even though the n reach several 
hundred observations. Otherwise the estimated odds may be 
greatly exaggerated. (See page 170 for meaning of odds in con- 
trast with probability.) 


We shall now illustrate the use of Fisher’s table with the
same data. We give this table on page 173. Since Fisher’s table
is intended for only hurried, rough interpretations, it gives
values at only certain significance levels, and it is sufficient
to estimate the probability in relation to these levels. We
enter Table XII, page 173, with n = 22 and follow along row 22
until we come as near as we can to 0.572, which is the value of
our t. Under the column headed 0.6 we find a t of 0.532 and under
the column headed 0.5 a t of 0.686. Since our t of 0.572 lies between
these two, we say that the probability is somewhere between
.60 and .50. This is the probability of getting as bad arithmetic
fit as we did, even though the true difference were zero. That is,
it gives the probability of getting on the basis of chance fluctuation
a t that deviates either positively or negatively from zero as
far as the one we have in hand, hence it gives the sum of the areas
in both the upper and the lower tail of the distribution of t’s
outside the range ±t. To find the probability of a divergence
in the same direction from zero as that of our sample and hence 
to make the interpretation comparable with that made above, 
we must divide these entries by two. Doing this, we find the 
probability to be somewhere between .25 and .30, which agrees 
with the determination of .2852 from Student’s table. Fisher’s 
table makes no provision for probabilities lower than 0.01 (or 
.005 on one side), which corresponds to odds of 199 to 1, nor for 
n’s higher than 30. 

The probability of a true difference beyond zero in the above 
problem is so low that we were not justified in elaborating 
upon it in the refined manner we employed. We did that merely 
to illustrate the method. The proper thing would have been to 
dismiss the difference as insignificant immediately upon finding 
the very low t, or at any rate to investigate only roughly the
probability of a true difference above zero. Many people 
believe that a point of reference should be set up as a norm for 
acceptable reliability. For many years American students have 
set a t of 3 as such standard. They called a difference reliable if 
it was three or more times its standard error and unreliable if it 
fell below that point. That is a very exacting standard; it 
demands that the odds be at least 740 to 1 that the true difference 
is above zero in the direction of the obtained one before it be 
accepted as reliably established. The practice that follows the
Fisher lead puts acceptable reliability in terms of fixed probabilities
rather than in terms of fixed abscissa values, so that the
technique may be extended to small samples as well as to large. 
As we saw above, the area in the tail of the distribution of t is 
dependent upon the size of the samples, and so it is convenient to 
keep the probability constant while n changes. Most of the 
Fisher tables are constructed on the principle that we care only 
to know if the probability is as low as 5 per cent, or if as low as 
1 per cent, that the difference could have arisen by chance. 
If it reaches 5 per cent, he calls it significant; and if it reaches 
1 per cent, he calls it highly significant. In view of the manner 
in which the Fisher tables are constructed, his 5 per cent corresponds,
in the case of a normal distribution, to a t of 1.96 and
his 1 per cent to a t of 2.58.
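The correspondence between these conventional points and the normal curve can be verified directly. The following sketch assumes a normal distribution throughout; the function names are our own.

```python
import math

def one_tailed(t):
    """Area of the unit normal curve above t."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def two_tailed(t):
    """Area of the unit normal curve outside the range -t to +t."""
    return math.erfc(abs(t) / math.sqrt(2.0))

p_196 = two_tailed(1.96)   # Fisher's "significant" level
p_258 = two_tailed(2.58)   # Fisher's "highly significant" level

# The older standard of a t of 3, stated as odds in one direction.
tail_3 = one_tailed(3.0)
odds_t3 = (1.0 - tail_3) / tail_3

print(f"P(|t| > 1.96) = {p_196:.4f}")
print(f"P(|t| > 2.58) = {p_258:.4f}")
print(f"odds for t = 3: {odds_t3:.0f} to 1")
```

The two-tailed areas come out at very nearly 5 and 1 per cent, and the odds for a t of 3 at about 740 to 1, in agreement with the figures quoted above.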

There may be some practical convenience in having some such 
commonly understood points of reference to mark “limits of 
confidence.” But we wish emphatically to warn our readers 
that any such limits are entirely arbitrary. Nothing happens 
at these points that is unique. Reliability is a matter of degree; 
the larger the t the higher the reliability. To employ these points
of reference mechanically is quite misleading and unwarranted. 
On the contrary, to state the issue in terms of probabilities 
and secondarily to observe that the ratio falls short of, or reaches, 
the conventionally accepted standards, is an effective way of 
preventing one’s thinking from becoming overmechanical. 

If, instead of caring to test the hypothesis that the true 
difference may be zero, the worker is interested in determining 
the limits between which he may claim, with a given degree of 
confidence, that the true difference lies, he should recall what 
was said on pages 137 to 189 about fiducial limits. 


THE STANDARD ERROR OF ANY DIFFERENCE 


Since the formulas for the standard errors of all differences are 
fundamentally alike, we may as well consider at once the general 
case. Let α stand for any statistic we please (mean, standard
deviation, coefficient of correlation, proportion, or what not)
and ω for any other statistic, whether of the same class or of a
different class. Then, if we conceive these as deviations from
the means of their respective series,

$$\sigma^2_{\alpha-\omega} = \frac{\Sigma(\alpha-\omega)^2}{S} = \frac{\Sigma\alpha^2}{S} + \frac{\Sigma\omega^2}{S} - \frac{2\Sigma\alpha\omega}{S}$$

where S is the number of samples. In the expression farthest to the right, the first term is $\sigma^2_\alpha$, and
the second is $\sigma^2_\omega$. We can put the third into a form that involves
an r if we multiply both the numerator and denominator by
$\sigma_\alpha\sigma_\omega$. Making these substitutions, we have

$$\sigma^2_{\alpha-\omega} = \sigma^2_\alpha + \sigma^2_\omega - 2\,\frac{\Sigma\alpha\omega}{S\sigma_\alpha\sigma_\omega}\,\sigma_\alpha\sigma_\omega$$

The reader will recognize in the expression $\Sigma\alpha\omega/S\sigma_\alpha\sigma_\omega$ the value
$r_{\alpha\omega}$. Substituting this and taking the square root

$$\sigma_{\alpha-\omega} = \sqrt{\sigma^2_\alpha + \sigma^2_\omega - 2r_{\alpha\omega}\sigma_\alpha\sigma_\omega} \qquad \text{(Standard error of any difference)} \qquad (99)$$


In all our further developments we shall need only to substitute
our particular statistics for the α and the ω. The standard
errors of the individual statistics were given in our preceding
chapter. Our new task will center in finding a value for the
$r_{\alpha\omega}$ for the several statistics so that we may substitute it in the
general formula. If we were concerned with the sum of statistics
rather than with their difference, (α + ω) would replace (α − ω)
in the above development, and we would have

$$\sigma_{\alpha+\omega} = \sqrt{\sigma^2_\alpha + \sigma^2_\omega + 2r_{\alpha\omega}\sigma_\alpha\sigma_\omega} \qquad \text{(Standard error of any sum)} \qquad (100)$$
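Formulas (99) and (100) differ only in the sign of the r term, and both can be put into a pair of small functions. This is a sketch only; the function names and the numerical values used to exercise them are hypothetical.

```python
import math

def se_difference(se_a, se_w, r):
    """Formula (99): standard error of the difference of two statistics."""
    return math.sqrt(se_a ** 2 + se_w ** 2 - 2.0 * r * se_a * se_w)

def se_sum(se_a, se_w, r):
    """Formula (100): standard error of the sum of two statistics."""
    return math.sqrt(se_a ** 2 + se_w ** 2 + 2.0 * r * se_a * se_w)

# Hypothetical standard errors of two statistics and their correlation.
se_alpha, se_omega, r_aw = 0.3, 0.4, 0.5

print(f"difference: {se_difference(se_alpha, se_omega, r_aw):.4f}")
print(f"sum:        {se_sum(se_alpha, se_omega, r_aw):.4f}")
print(f"r = 0:      {se_difference(se_alpha, se_omega, 0.0):.4f}")
```

A positive r shrinks the standard error of a difference and enlarges that of a sum; with r equal to zero the two formulas coincide.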


THE NULL HYPOTHESIS 


Formula (99) and adaptations from it operate on the principle 
that the two classes compared might be different, and hence we 
estimate for each its own variance. We next concern ourselves 
with both the extent of the difference and, perhaps, with the 
possibility that the true difference might go down as far as zero, 
or even have the opposite sign from the one obtained in the 
sample. We could, however, start with the assumption that the 
two samples may have arisen merely as chance fluctuations in 
drawing samples from the same homogeneous population and 
that, as such, there could be no difference between them except 
what such chance fluctuation could explain. This is the null 
hypothesis, as applied to differences. It can, of course, be 
extended to include the average difference among a plurality 
of classes (as in analysis of variance) or to the reliability of a
single statistic (as when we wish to ask whether an observed r
could have arisen out of a situation in which the true r is zero), 
or to other types of situations. 

Our problem would then become that of testing the null 
hypothesis—of confirming or refuting it. Since we assume that 
we are dealing with a homogeneous population to which all of 
our classes really belong as samples, the true σ of the numerator
of our standard-error formula will be the same for all samples
and can best be predicted by averaging the moments from the
several samples. Thus, for our two samples,

$$s^2 = \frac{\Sigma x_1^2 + \Sigma x_2^2}{(N_1 - 1) + (N_2 - 1)} = \frac{\Sigma x_1^2 + \Sigma x_2^2}{N_1 + N_2 - 2}$$

where each x is taken as a deviation from the mean of its own
class, and N is the number of individuals in a sample. Since the
s would be the same for both classes and the classes would be
conceived as independent samples, the s could come outside the
radical, and we would have, for the standard error of a difference
between means,¹

$$\sigma_{M_1-M_2} = s\sqrt{\frac{1}{N_1} + \frac{1}{N_2}} \qquad \text{(Standard error of a difference between means, assuming the null hypothesis)} \qquad (101)$$


Other difference formulas would take corresponding forms. 
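Formula (101) may be sketched in code as follows. The two small samples are invented for illustration, and the function name is our own; the degrees of freedom returned are the n with which Student’s table is to be entered.

```python
import math

def pooled_se_of_difference(sample1, sample2):
    """Formula (101): pooled s and the standard error of M1 - M2."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    # Sums of squared deviations, each from the mean of its own class.
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / df)
    se = s * math.sqrt(1.0 / n1 + 1.0 / n2)
    return m1 - m2, se, df

# Hypothetical scores for two independent classes (illustrative only).
class_1 = [5.0, 7.0, 9.0]
class_2 = [2.0, 4.0, 6.0, 8.0]

diff, se, df = pooled_se_of_difference(class_1, class_2)
t = diff / se
print(f"difference = {diff:.2f}, standard error = {se:.4f}, t = {t:.3f}, n = {df}")
```

Note that the pooling is legitimate only on the assumption stated in the text, that both classes are samples from one homogeneous population.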

It is entirely possible to think of the reliability of differences 
thus in terms of the null hypothesis. Its essence consists in 
speculating upon how far sample statistics might reach from 
zero in a homogeneous population and whether some samples 
from such population might reach as high or as low as the one in 
hand. If so, there can be no assurance that a real difference 
between two populations exists because the behavior of a single 
one could possibly have given rise to the apparent difference 
between the samples. Such procedure may serve rough purposes 
well enough, and some people think it is a little easier to handle 
than the more refined methods of classical statistics. But it 
sometimes leads into rather farfetched and awkward conse- 
quences. Fisher points out that it sometimes enhances the 
value of t and thus leads to a more exacting test than legitimate;
and also that it may give results so discordant with those of

1 Student’s t table is to be entered with n = N₁ + N₂ − 2.

the correct method that one or the other must be ignored. The
more general methods, discussed earlier in this chapter and later, 
obviate these aberrations. In the problem cited in this para- 
graph, for example, in which Fisher finds discrepancy between 
Student’s formula and formula (101) above, completely consistent 
results would have been obtained if he had used the formula 
correctly taking account of the correlation element [formula (90)] 
and Student’s formula [which is the same as our formula (95)]. 
Formula (90) is the absolutely general case; formulas of the type 
of (91) are specialized in the sense that they apply only where 
there is no correlation; and formulas of the type of (101) represent
the still more limited case where it is not too farfetched
to assume that a combination of the moments from the several 
samples can predict a variance common to them. The more 
general case adds so little extra labor that it does not seem to us 
worth while to employ the cruder method of the null hypothesis. 
To envisage the behavior of statistics in successive samples;
to take cognizance of the influence of correlation in restricting 
the fluctuations of those samples; to estimate for each statistic 
its own most likely population value instead of merely averaging 
the two together; to raise the question whether with very large 
populations these means might occupy the same position so that 
the difference would be zero; and to raise corresponding questions 
about the variabilities of the samples, about their skewness and 
kurtosis, and about the possible extent of differences with which 
other arrays correlate with each of the factors; all seems much 
more satisfying and meaningful than merely to say that, if there 
were a homogeneous population giving rise to samples of a certain 
size, some of these random samples might show as great differ- 
ences as the one we have in hand. Certainly that is true if the 
samples are large enough to give any dependability to the 
statistics in hand—say, 30 or more cases in each sample. 

These are two different approaches, and each has its plausi- 
bility. The null hypothesis is especially useful for rough explora- 
tory research in which relatively small samples are used. For 
constructive research, especially with large samples, the more 
elaborate techniques of classical statistics are needed. We 
continue with the elaboration of these, developing the applica- 
tions of formula (99) to the several types of statistics. 

1 Fisher, R. A., Statistical Methods, 7th ed., pp. 129, 133.

STANDARD ERROR OF THE DIFFERENCE
BETWEEN STANDARD DEVIATIONS


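In formula (99), page 177, we provided for the standard error of
any difference. For our present purpose we need only substitute
in this general formula $\sigma_x$ for α and $\sigma_y$ for ω. We shall then
have

$$\sigma_{\sigma_x-\sigma_y} = \sqrt{\sigma^2_{\sigma_x} + \sigma^2_{\sigma_y} - 2r_{\sigma_x\sigma_y}\sigma_{\sigma_x}\sigma_{\sigma_y}}$$

We know the sigmas of the sigmas ($\sigma_\sigma = \sigma/\sqrt{2N}$), so that we
require only the coefficient of correlation between standard
deviations in series of correlated arrays. We shall now proceed
to determine a value for $r_{\sigma_x\sigma_y}$. It is most convenient to approach
that through $r_{\sigma^2_x\sigma^2_y}$ and then to return to the unsquared σ’s.

We shall take our x’s and y’s as deviations from the means of
their respective samples; but the x²’s and the y²’s will, of course,
not then be in deviation form. Our formula for r will be

$$r_{\sigma^2_x\sigma^2_y} = \frac{\dfrac{1}{S}\Sigma\left(\dfrac{\Sigma x^2}{N}\cdot\dfrac{\Sigma y^2}{N}\right) - \left(\dfrac{1}{S}\Sigma\dfrac{\Sigma x^2}{N}\right)\left(\dfrac{1}{S}\Sigma\dfrac{\Sigma y^2}{N}\right)}{\sqrt{\dfrac{1}{S}\Sigma\left(\dfrac{\Sigma x^2}{N}\right)^2 - \left(\dfrac{1}{S}\Sigma\dfrac{\Sigma x^2}{N}\right)^2}\;\sqrt{\dfrac{1}{S}\Sigma\left(\dfrac{\Sigma y^2}{N}\right)^2 - \left(\dfrac{1}{S}\Sigma\dfrac{\Sigma y^2}{N}\right)^2}}$$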


where S is the number of samples and N the population within
each sample. But it will be observed that in this expression
all our quantities are of the form Σx²/N or Σy²/N. The former
is the expression for the mean of the x²’s and the latter for the
mean of the y²’s. We have, therefore, a case of the r between
means of samples, which was shown (page 162) to equal the r
between the variates within a sample. We may, therefore, write


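$$r_{\sigma^2_x\sigma^2_y} = \frac{N\Sigma x^2y^2 - \Sigma x^2\,\Sigma y^2}{\sqrt{N\Sigma(x^2)^2 - (\Sigma x^2)^2}\,\sqrt{N\Sigma(y^2)^2 - (\Sigma y^2)^2}} = \frac{N\Sigma x^2y^2 - N^2\sigma^2_x\sigma^2_y}{\sqrt{N\Sigma x^4 - N^2\sigma^4_x}\,\sqrt{N\Sigma y^4 - N^2\sigma^4_y}} \qquad (B)$$

In the lithoprinted edition of this book we showed that

$$\frac{\Sigma x^2y^2}{N} = \sigma^2_x\sigma^2_y(1 - r^2 + \beta_2 r^2),$$

so that, assuming that $\beta_2 = 3$, $\Sigma x^2y^2 = N\sigma^2_x\sigma^2_y(1 + 2r^2)$. We
shall shortly substitute this value in the r formula. But first
let us find a simpler value for $\Sigma x^4$ and $\Sigma y^4$ of the denominator.
By definition $\beta_2 = \Sigma x^4/N\sigma^4_x$. In a normal distribution $\beta_2 = 3$.
Therefore, clearing of fractions, $\Sigma x^4 = 3N\sigma^4_x$. Similarly
$\Sigma y^4 = 3N\sigma^4_y$.
Substituting these three values in the r formula, we have

$$r_{\sigma^2_x\sigma^2_y} = \frac{N[N\sigma^2_x\sigma^2_y(1 + 2r^2)] - N^2\sigma^2_x\sigma^2_y}{\sqrt{N(3N\sigma^4_x) - N^2\sigma^4_x}\,\sqrt{N(3N\sigma^4_y) - N^2\sigma^4_y}} = \frac{N^2\sigma^2_x\sigma^2_y(1 + 2r^2) - N^2\sigma^2_x\sigma^2_y}{\sqrt{3N^2\sigma^4_x - N^2\sigma^4_x}\,\sqrt{3N^2\sigma^4_y - N^2\sigma^4_y}}$$

The term $N^2\sigma^2_x\sigma^2_y$ can be canceled out of the numerator and the
denominator, so that the expression will simplify to

$$r_{\sigma^2_x\sigma^2_y} = \frac{1 + 2r^2 - 1}{\sqrt{3 - 1}\,\sqrt{3 - 1}} = \frac{2r^2}{\sqrt{2}\sqrt{2}} = r^2$$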


This is the coefficient of correlation between the sigmas squared. 
What we desire, however, is the r between the sigmas, not that 
between the sigmas squared. Unfortunately no simple formula 
can be given for the relation between the r between measures
and the r between those measures squared. It depends upon 
the origin from which the squared measures are taken. We can 
show, by a process of development which it is scarcely worth 
while to reproduce here, that, assuming homoscedasticity and 
mesokurtosis, 


$$r_{x^2y^2} = [\text{expression illegible in the source}] \qquad \text{(r between squared measures in terms of r between the measures)} \qquad (102)$$

where $\sigma_{y_c}$ is the standard deviation of a column in the correlation
table while $\sigma_y$ is the standard deviation of the whole y distribution.
As the M increases indefinitely in comparison with the
σ’s, the fraction involving the parentheses approaches 1 in value,
and our expression becomes

$$r_{x^2y^2} = \sqrt{1 - \frac{\sigma^2_{y_c}}{\sigma^2_y}}$$

But this is also the formula for $r_{xy}$. In our particular application,
the mean will customarily be large in comparison with the
variabilities, so that we may take the r between the sigmas
squared to be substantially the same as the r between the sigmas.
Therefore

$$r_{\sigma_x\sigma_y} = r^2_{xy} \qquad \text{(Coefficient of correlation between standard deviations in correlated series)} \qquad (103)$$

Substituting this value in the formula for the standard error 
of the difference between standard deviations, given at the 
opening of this section, and employing numerical subscripts, 


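$$\sigma_{\sigma_1-\sigma_2} = \sqrt{\sigma^2_{\sigma_1} + \sigma^2_{\sigma_2} - 2r^2_{12}\sigma_{\sigma_1}\sigma_{\sigma_2}} \qquad \text{(Standard error of the difference between standard deviations)} \qquad (104)$$

If the N is the same in the two series, we may conveniently
substitute $\sigma/\sqrt{2N}$ for $\sigma_\sigma$ and have

$$\sigma_{\sigma_1-\sigma_2} = \sqrt{\frac{\sigma^2_1 + \sigma^2_2 - 2r^2_{12}\sigma_1\sigma_2}{2N}} \qquad (104a)$$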


The application of this formula to an experimental problem will 
be illustrated in our chapter on experimentation, pages 455 to 466. 
See those pages also for a different formula for small samples. 
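Formulas (104) and (104a) lend themselves to a short computational sketch. The function names and the illustrative values (standard deviations of 10 and 8, an N of 50, and an r of .60 between the paired variates) are hypothetical; the sketch applies only to samples large enough for $\sigma/\sqrt{2N}$ to serve as the standard error of a standard deviation.

```python
import math

def se_of_sd(sd, n):
    """Standard error of a standard deviation, sigma / sqrt(2N)."""
    return sd / math.sqrt(2.0 * n)

def se_diff_sd(sd1, n1, sd2, n2, r12):
    """Formula (104); the r between the sigmas is taken as r12 squared."""
    s1, s2 = se_of_sd(sd1, n1), se_of_sd(sd2, n2)
    return math.sqrt(s1 ** 2 + s2 ** 2 - 2.0 * r12 ** 2 * s1 * s2)

def se_diff_sd_equal_n(sd1, sd2, n, r12):
    """Formula (104a), for the same N in both series."""
    return math.sqrt((sd1 ** 2 + sd2 ** 2 - 2.0 * r12 ** 2 * sd1 * sd2) / (2.0 * n))

# Hypothetical values for illustration.
general = se_diff_sd(10.0, 50, 8.0, 50, 0.60)
shortcut = se_diff_sd_equal_n(10.0, 8.0, 50, 0.60)
print(f"(104): {general:.4f}   (104a): {shortcut:.4f}")
```

With equal N the two forms necessarily give the same value; with r equal to zero both reduce to the formula for uncorrelated series.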


THE STANDARD ERROR OF THE DIFFERENCE
BETWEEN PROPORTIONS 


In applying our general formula for the standard error of any
difference we need only to know the r between proportions in
correlated arrays, since we already know the standard errors of
the separate proportions (page 145). If, in computing proportions,
we look upon each individual as scoring one point when
present and no points when absent, as is the customary way, a
proportion equals Σx/N, where N is the whole population and
Σx is the number present in the count. The mean score would
also be Σx/N, where the symbols have the same meaning. The
correlation between proportions would, therefore, be the same
as the correlation between means, which we have already shown
to be equal to the correlation between the variates within the two
matched samples. That is, $r_{p_1p_2} = r_{12}$. Our formula would
then become

$$\sigma_{p_1-p_2} = \sqrt{\sigma^2_{p_1} + \sigma^2_{p_2} - 2r_{p_1p_2}\sigma_{p_1}\sigma_{p_2}} = \sqrt{\frac{p_1q_1}{N_1} + \frac{p_2q_2}{N_2} - 2r_{12}\sqrt{\frac{p_1q_1}{N_1}\cdot\frac{p_2q_2}{N_2}}} \qquad \text{(Standard error of the difference between proportions in the case of matched groups)} \qquad (105)$$

where q = 1 − p.

In groups selected from the two populations at random instead
of matched on some criterion correlated with the outcome with
regard to which we are measuring proportions, the r would be
zero, and the tail of the formula would drop off, so that we would
have

$$\sigma_{p_1-p_2} = \sqrt{\frac{p_1q_1}{N_1} + \frac{p_2q_2}{N_2}} \qquad (106)$$

It is seldom that the former of these formulas can be employed,
for it is seldom that we know the correlation factor when dealing
with proportions. The possibility of its use may be illustrated
from a study by Freeman and Hoefer on the influence of motion
pictures upon conduct.¹ They match two groups of children on
information and intelligence test scores, then they show to
one group motion pictures propagandizing for clean teeth.
After the lapse of sufficient time to permit the instruction to
function, they ascertain, among other things, what proportion
of each group possessed toothbrushes. It was found that 99.48
per cent of the group that had seen the motion pictures owned
toothbrushes while 97.65 per cent of the nonmovie group possessed
them. The investigators do not report the coefficient
of correlation between the two groups in respect to owning
toothbrushes when matched for information and intelligence;
it would need to be computed by the tetrachoric method described
on page 366. Let us assume, for purposes of illustration,
that this r turned out to be .30. We would then have (since per
cents are proportions multiplied by 100)

$$\sigma_{p_1-p_2} = \sqrt{\frac{(99.48)(.52)}{192} + \frac{(97.65)(2.35)}{170} - 2(.30)\sqrt{\frac{(99.48)(.52)(97.65)(2.35)}{(192)(170)}}}$$

which equals 1.12 per cent. The difference itself is 1.83 per cent,
so that the ratio of the difference to its standard error is 1.63.
This is to be interpreted in the same manner as in our illustration
with the difference between means, page 169. Referring to our
table, page 486, we find that a ratio of 1.63 indicates chances
of 18.2 to 1 that a real difference exists in favor of the children
who had been instructed by means of the motion pictures. If
we did not employ the tail to the formula but instead ignored the
correlation element, we would get from the first two terms a
standard error of 1.27. This would give a ratio of 1.44 between
the difference and its standard error, which would indicate
chances of 12.5 to 1 that a real difference exists in favor of the
motion-picture group.

1 Freeman, Frank N., and Carolyn Hoefer, “An Experimental Study
of the Influence of Motion Pictures on Behavior,” J. Educ. Psychol., Vol. 22,
pp. 411-425 (1931).
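The arithmetic of the toothbrush illustration can be retraced as follows. This is a sketch only: the function name is our own, the per cents and group sizes are those quoted above, and the r of .30 is the value assumed in the text for purposes of illustration.

```python
import math

def se_diff_percents(pct1, n1, pct2, n2, r=0.0):
    """Standard error of the difference between two per cents.

    With r = 0 this reduces to the formula for random groups (106);
    with r given it is the matched-groups formula (105).
    """
    var1 = pct1 * (100.0 - pct1) / n1
    var2 = pct2 * (100.0 - pct2) / n2
    return math.sqrt(var1 + var2 - 2.0 * r * math.sqrt(var1 * var2))

movie, n_movie = 99.48, 192        # per cent owning toothbrushes, group size
control, n_control = 97.65, 170
assumed_r = 0.30                   # assumed in the text, not reported by the study

diff = movie - control
se_matched = se_diff_percents(movie, n_movie, control, n_control, assumed_r)
se_random = se_diff_percents(movie, n_movie, control, n_control)

print(f"difference = {diff:.2f}")
print(f"with r:    SE = {se_matched:.2f}, ratio = {diff / se_matched:.2f}")
print(f"without r: SE = {se_random:.2f}, ratio = {diff / se_random:.2f}")
```

Dropping the correlation term enlarges the standard error and so lowers the ratio, which is why ignoring a real positive correlation understates the reliability of the difference.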

We shall illustrate the application of the second formula 
(106) from a study made by C. N. Rabold on the differences 
between country pupils and town pupils.¹ Rabold ascertained,
among 38 other things, what proportion of 71 high-school 
pupils who came from the open country and of 65 pupils who came 
from a small city are employed after school hours. He found the 
proportion to be 57 per cent for the country and 51 per cent for 
the town, a difference of 6 per cent. Is this sufficient difference 
to indicate that repeated sampling of the same kind of populations 
would continue to show differences on the same side and that the 
theoretical (true) difference obtained from an infinitely large 
population would show a larger proportion of country pupils 
of this type of population employed after school hours than of 
town pupils? Since the groups selected were random ones, the 
formula without the correlation element is the correct one to 
use in getting an answer to the question about reliability. 
Substituting our particular values for the symbols, 


$$\sigma_{p_1-p_2} = \sqrt{\frac{(.57)(.43)}{71} + \frac{(.51)(.49)}{65}} = .085$$


Thus, while the difference is 6 per cent the standard error of
that difference is 8.5 per cent. The difference is only 0.7 of its
standard error. A ratio of 0.7 between a difference and its
standard error indicates chances of only 3.1 to 1 that the true
difference lies in the same direction. This is extremely low
statistical significance, so that we must conclude that we cannot
trust a difference so small in comparison with its standard error
as proof that more country pupils are employed after school
than town pupils.

1 Rabold, C. N., and C. C. Peters, “How Country Pupils Differ from
Town Pupils,” J. Ed. Sociol., Vol. 3, pp. 297-306.
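The computation for the country and town pupils may be sketched in the same manner. The proportions and group sizes are those quoted from the study; the helper `upper_tail` is our own, the normal curve is assumed, and the ratio is rounded to one decimal before the odds are taken, as in the text.

```python
import math

def upper_tail(z):
    """Proportion of a unit normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_country, n_country = 0.57, 71
p_town, n_town = 0.51, 65

# Formula (106): random groups, no correlation term.
se = math.sqrt(p_country * (1.0 - p_country) / n_country
               + p_town * (1.0 - p_town) / n_town)
ratio = (p_country - p_town) / se

# Odds that the true difference lies in the same direction.
tail = upper_tail(round(ratio, 1))
odds = (1.0 - tail) / tail

print(f"SE = {se:.3f}, ratio = {ratio:.1f}, odds = {odds:.1f} to 1")
```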


THE STANDARD ERROR OF THE DIFFERENCE BETWEEN TWO 
COEFFICIENTS OF CORRELATION 
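Substituting r’s for the α and the ω in our general formula,
we get

$$\sigma_{r-r'} = \sqrt{\sigma^2_r + \sigma^2_{r'} - 2r_{rr'}\sigma_r\sigma_{r'}} \qquad (107)$$

where r and r′ are the two coefficients compared.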


From our previous chapter we know the σ’s of the r’s. We need
only a value for the r between two r’s. We would scarcely be
justified here in taking the space necessary to develop these
required formulas. Pearson and Filon¹ give them for the two
cases as follows: (1) the case in which the same array occurs as
one factor in both the r’s; and (2) the case in which the four
arrays are different. But all the arrays are somehow correlated
with one another; otherwise there would be no correlation
between the r’s. The first case is as follows:


$$r_{r_{12} r_{13}} = r_{23} - \frac{r_{12} r_{13}\,(1 - r_{23}^2 - r_{12}^2 - r_{13}^2 + 2 r_{12} r_{13} r_{23})}{2(1 - r_{12}^2)(1 - r_{13}^2)} \qquad (108)$$

(Coefficient of correlation between two r’s having one array in common)


The other case involves a considerably longer formula, as 
follows: 


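$$r_{r_{12} r_{34}} = \Big\{ (r_{13} - r_{12} r_{23})(r_{24} - r_{23} r_{34}) + (r_{14} - r_{34} r_{13})(r_{23} - r_{13} r_{12}) + (r_{13} - r_{14} r_{34})(r_{24} - r_{14} r_{12}) + (r_{14} - r_{12} r_{24})(r_{23} - r_{24} r_{34}) \Big\} \left[ \frac{1}{2(1 - r_{12}^2)(1 - r_{34}^2)} \right] \qquad (108a)$$

(Coefficient of correlation between two r’s in correlated series, no array in common)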


1 Pearson, Karl, and L. N. G. Filon, “Mathematical Contributions to
the Theory of Evolution,” Trans. Roy. Soc. (London), Series A, Vol. 191,
pp. 259, 262.

We shall illustrate the operation of these formulas with data
from a study by H. Clair Henry on the reliability and validity
of the consistent-response method of scoring a true-false test
as compared with that of the rights-minus-wrongs method.¹
He scored the Peters Test of General Information according to
the rules for this test, viz., a credit for an item if a pupil responded
to it twice correctly when stated in different ways and no penalty
for wrongs, and also by the conventional rights-minus-wrongs
method, computed validity and reliability coefficients by both
methods, and examined the differences for sign and for statistical
significance. As one test of validity he correlated the scores
on the test by both methods with the recorded IQ’s of the
pupils. We shall consider his trial with 90 senior high-school
students. Evidently this problem comes under our first case
since, in each of the two scorings, marks were correlated with the
same array, viz., the intelligence quotients. We shall call the
IQ array 1; the R-W array 2; and the consistent-response array 3.
The correlations Henry found were as follows: $r_{12} = .76$; $r_{13} = .84$;
and $r_{23} = .89$. The consistent-response method showed a higher
correlation with intelligence quotients by (.84 − .76) = .08.
Is this a significant difference? We shall apply to it our formula
(108).


$$r_{r_{12} r_{13}} = .89 - \frac{(.76)(.84)\,[1 - .89^2 - .76^2 - .84^2 + 2(.76)(.84)(.89)]}{2(1 - .76^2)(1 - .84^2)} = .773$$


Putting this value for the r into our standard-error formula we 
have 


$$\sigma_{r_{12} - r_{13}} = \sqrt{\frac{(1 - .76^2)^2}{90} + \frac{(1 - .84^2)^2}{90} - \frac{2(.773)(1 - .76^2)(1 - .84^2)}{90}} = .03$$


Thus the standard error of the difference is .03 while the 
difference is .08, making a ratio of 2.67. This indicates chances 
of 263 to 1 of a true difference in favor of the consistent-response 
method of scoring. If we disregard the correlation between the
r’s and employ the formula in the customary manner without the
tail, we get a standard error of .054 and a ratio of 1.48 between
the difference and its standard error. This indicates chances
of only 13 to 1 that a true difference is in favor of the consistent-response
method. Evidently, if we had depended upon the short
formula which ignores the correlation element between the r’s,
we would have greatly underestimated the reliability of our
difference.

1 An unpublished master’s thesis at Pennsylvania State College, 1932.
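The first case can be checked numerically with a short sketch. It assumes, as the worked figures above imply, that the standard error of an r is taken as (1 − r²)/√N; the function names are illustrative only. With Henry’s values the expression in formula (108) evaluates to about .733 rather than the printed .773, which may be a misprint; the standard error of the difference rounds to .03 with either value.

```python
import math

def r_between_rs_common(r12, r13, r23):
    """Formula (108): correlation between r12 and r13 (array 1 in common)."""
    num = r12 * r13 * (1 - r23**2 - r12**2 - r13**2 + 2 * r12 * r13 * r23)
    den = 2 * (1 - r12**2) * (1 - r13**2)
    return r23 - num / den

def se_r(r, n):
    """Standard error of a single r, taken here as (1 - r^2) / sqrt(N)."""
    return (1 - r**2) / math.sqrt(n)

def se_diff_r(r_a, r_b, n, r_between=0.0):
    """Formula (107); with r_between = 0 it reduces to the short formula (109)."""
    sa, sb = se_r(r_a, n), se_r(r_b, n)
    return math.sqrt(sa**2 + sb**2 - 2 * r_between * sa * sb)

rr = r_between_rs_common(0.76, 0.84, 0.89)
print(round(rr, 3))                                  # 0.733
print(round(se_diff_r(0.76, 0.84, 90, rr), 2))       # 0.03, with the tail
print(round(se_diff_r(0.76, 0.84, 90), 3))           # 0.054, tail ignored
```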

In illustration of our second case we shall use Henry’s figures
for the difference between the reliability coefficients by the two
methods of scoring for 100 college sophomores. In this case he
correlated scores from form A of the test with those for form B by
each of the two methods. The arrays were numbered as follows:
1 is the scores on form A by the rights-minus-wrongs method; 2,
the scores on form B by the rights-minus-wrongs method; 3, the
scores on form A by the consistent-response method; and
4, the scores on form B by the consistent-response method.
The reliability coefficient by the rights-minus-wrongs method
would be $r_{12}$, while that for the consistent-response method would
be $r_{34}$. We are interested in the difference between these two r’s.
The values of the several r’s needed in the formula are as follows:
$r_{12} = .52$; $r_{34} = .64$; $r_{13} = .61$; $r_{14} = .36$; $r_{23} = .43$; $r_{24} = .61$.
When these values for the r’s are put into formula (108a),
$r_{r_{12} r_{34}}$ turns out to be .334. When this value of r is used in
the general formula for the standard error of the difference
between two r’s [Eq. (107)], the standard error is found to be
.08, giving a ratio of 1.5 between the difference and its standard
error and indicating chances of 14 to 1 of a true difference in
favor of the consistent-response method of scoring. If the r
between the r’s is ignored, the standard error of the difference is
.096, the ratio 1.25, and the chances of a true difference in the
same direction 8.5 to 1. The use of the tail to the formula here
makes less difference than in the previous case because the intercorrelations
are rather low.
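The second case can be sketched in the same way. The function below is a direct transcription of formula (108a); its name and argument order are our own, the six r’s are taken in the order $r_{12}, r_{34}, r_{13}, r_{14}, r_{23}, r_{24}$, and the standard error of an r is again assumed to be (1 − r²)/√N.

```python
import math

def r_between_rs_no_common(r12, r34, r13, r14, r23, r24):
    """Formula (108a): correlation between r12 and r34 (no array in common)."""
    total = ((r13 - r12 * r23) * (r24 - r23 * r34)
             + (r14 - r34 * r13) * (r23 - r13 * r12)
             + (r13 - r14 * r34) * (r24 - r14 * r12)
             + (r14 - r12 * r24) * (r23 - r24 * r34))
    return total / (2 * (1 - r12**2) * (1 - r34**2))

def se_diff_r(r_a, r_b, n, r_between=0.0):
    """Formula (107), with each sigma_r taken as (1 - r^2) / sqrt(N)."""
    sa = (1 - r_a**2) / math.sqrt(n)
    sb = (1 - r_b**2) / math.sqrt(n)
    return math.sqrt(sa**2 + sb**2 - 2 * r_between * sa * sb)

rr = r_between_rs_no_common(0.52, 0.64, 0.61, 0.36, 0.43, 0.61)
print(round(rr, 3))                              # 0.334
print(round(se_diff_r(0.52, 0.64, 100, rr), 2))  # 0.08
```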

Thus, to be strictly correct, one needs to employ the formula
for the standard error of a difference between r’s that takes
account of the r between the r’s. But we have taken the position
that standard errors of r’s are not to be taken so seriously as they
customarily are, and this would extend to the standard error of
the difference between the r’s. Since the tail of the formula
involves the computation of additional r’s probably not needed
otherwise in the problem and since we are on rather uncertain
ground here, most people will probably wish to continue to
employ the approximately correct formula that ignores the tail
containing the r between the r’s. In this form

$$\sigma_{r_1 - r_2} = \sqrt{\sigma_{r_1}^2 + \sigma_{r_2}^2} \qquad (109)$$

(Standard error of the difference between two r’s when the element of matching is ignored)

But when one uses this abbreviated formula where the element
of matching is present, he should recognize that his obtained
standard error is probably too high, and possibly much too high.
The Significance of a Difference between z’s.—The z’ tech- 
nique (see page 155) is especially recommended by Fisher for 
testing the significance of a difference between r’s. We shall, 
therefore, apply it to testing the significance of the difference 
between the z’ values of the r’s we tested on page 186 by formulas 
(107) and (109). The formula is the customary one of type 
(99) with the correlation factor omitted. It is, as inspection 
of the formula for the standard error of z’ [formula (88) page 156] 
would indicate, 


$$t = \frac{z'_{12} - z'_{13}}{\sqrt{\dfrac{1}{N_{12} - 3} + \dfrac{1}{N_{13} - 3}}} \qquad (110)$$

(Standard error ratio for the difference between two z’s)


We need first to obtain the z’ value for each of the r’s. If
a table of hyperbolic functions is available, such as is printed
in the Handbook of Chemistry and Physics, the hyperbolic arc
tangent of r may be read directly from it, and that is z’. Or
tables covering certain ranges of z’ may be found in Fisher’s
manual and in some other books. If no such tables are available,
the values must be obtained by the use of tables of logarithms
as indicated by formula (87), page 155. By the use of the table
of hyperbolic functions in the Handbook of Chemistry and Physics
we get as z’ for our r’s

$r_{12} = .76$; $z'_{12} = 0.9962$
$r_{13} = .84$; $z'_{13} = 1.2212$

$$t = \frac{0.9962 - 1.2212}{\sqrt{\dfrac{1}{90 - 3} + \dfrac{1}{90 - 3}}} = -1.48$$


Thus we get as our ratio of the difference between the z’s and
the standard error of that difference 1.48, which is to be interpreted
by use of the normal curve functions in our familiar
manner. That is exactly the same ratio as we obtained on
page 186 by the use of the standard formula [Eq. (109)], and both
ratios are to be interpreted in exactly the same manner. So
for all our extra labor we gained nothing, in this problem, by
transmuting to z’s; and we lost all the additional precision that
formula (107) gave us. We cannot use with z’ a formula of the
type of (107) because the r between z’s is unknown.
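Where no table of hyperbolic functions is at hand, the z’ computation can be sketched with a standard math library, since z’ is the hyperbolic arc tangent of r. The function name below is illustrative and not from the text.

```python
import math

def z_ratio(r_a, r_b, n_a, n_b):
    """Formula (110): difference between two z' values over its standard error."""
    z_a, z_b = math.atanh(r_a), math.atanh(r_b)
    return (z_a - z_b) / math.sqrt(1 / (n_a - 3) + 1 / (n_b - 3))

print(round(math.atanh(0.76), 4))             # 0.9962
print(round(math.atanh(0.84), 4))             # 1.2212
print(round(z_ratio(0.76, 0.84, 90, 90), 2))  # -1.48
```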


Exercises 


1. From the data given in Table IV, pages 58 to 61, ascertain whether 
girls differ from boys in grade-point average; in scores in history. How 
reliable are these differences? (Note that these are random groups, hence 
no correlation element is present.) 

2. Match girls and boys for general intelligence scores; i.e., for each girl 
find a boy with the same, or nearly the same, intelligence test score. Then 
see whether there are sex differences in grade-point average when the groups 
are thus matched for general intelligence. Do likewise with history scores 
and with scores on the other sections of the test. How significant are these 
differences? (Remember that now the correlation element is present.) 

3. How do the sexes compare in variability: (1) when random groups are 
taken? (2) when groups are matched for general intelligence? 

4. Are there significant differences in the extent to which the scores in the 
several functions correlate with general intelligence test scores? 

5. Revert to the matched groups of Exercise 2. Compute for each 
of the sexes the r between intelligence test scores and science scores (or other 
array in which you are most interested). Is there a statistically significant 
difference between the r for girls and that for boys? [Remember that here 
the matching element is present and hence formula (108) applies. It is
considered that the scores on the matching element are perfectly correlated 
between the two groups, so that a single subscript may refer to either set.] 

6. From these same matched groups compute the r between history scores 
for boys and those for girls and also the r between science scores for boys and 
those for girls. Is there a statistically significant difference between these 
r’s? [Note that here formula (108a) applies.]
CHAPTER VII


INFERRING COEFFICIENTS OF CORRELATION 
FOR CHANGED CONDITIONS 


THE GENERALIZED SPEARMAN PROPHECY FORMULA 


We shall first develop a general formula for the correlation of 
the sum of corresponding scores in a sets of similar arrays in an 
x series and the sum of corresponding scores in b sets of similar 
arrays in a y series. The reader is asked to follow critically the 
development in this section because it will be made the basis of 
the derivation of practically all the formulas of this chapter. 

Let $x_1, x_2, x_3, \ldots, x_a$ be scores made by one individual in
the x series (which may be such measures as estimates by judges
as well as scores on an objective test), and let $y_1, y_2, y_3, \ldots, y_b$
be this same individual’s scores in the y series. Let these all be
conceived as deviations from the means of their respective
arrays. Then, employing our product-moment correlation formula
in the shape,

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$$

we get for our particular type of data

$$r_{(x_1 + x_2 + \cdots + x_a)(y_1 + y_2 + \cdots + y_b)} = \frac{\sum (x_1 + x_2 + x_3 + \cdots + x_a)(y_1 + y_2 + y_3 + \cdots + y_b)}{\sqrt{\sum (x_1 + x_2 + x_3 + \cdots + x_a)^2}\;\sqrt{\sum (y_1 + y_2 + y_3 + \cdots + y_b)^2}}$$

Multiplying together our polynomials in the numerator and 
squaring those in the denominator as indicated, placing the 
summation sign with each member instead of before the expres- 
sions as wholes, and using a more abbreviated symbolism for the 
r between sums, 


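$$r_{(ax)(by)} = \frac{\sum x_1 y_1 + \sum x_1 y_2 + \cdots + \sum x_2 y_1 + \sum x_2 y_2 + \cdots + \sum x_a y_b}{\sqrt{\sum x_1^2 + \sum x_2^2 + \cdots + \sum x_1 x_2 + \sum x_2 x_1 + \cdots}\;\sqrt{\sum y_1^2 + \sum y_2^2 + \cdots + \sum y_1 y_2 + \sum y_2 y_1 + \cdots}}$$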

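Since we are considering the sum of similar samples, we may
take the sigmas within the x series to be essentially equal to one
another, and likewise we may take the sigmas within the y
series to be substantially equal to one another. Let us then
divide both numerator and denominator of the right-hand
member of the equation by $n\sigma_x\sigma_y$, these being the typical standard
deviations of the series to which they belong and the n being
the number of cases in any one sample. We shall then have

$$r_{(ax)(by)} = \frac{\dfrac{\sum x_1 y_1}{n\sigma_x\sigma_y} + \dfrac{\sum x_1 y_2}{n\sigma_x\sigma_y} + \cdots + \dfrac{\sum x_2 y_1}{n\sigma_x\sigma_y} + \cdots + \dfrac{\sum x_a y_b}{n\sigma_x\sigma_y}}{\sqrt{\dfrac{\sum x_1^2}{n\sigma_x^2} + \dfrac{\sum x_2^2}{n\sigma_x^2} + \cdots + \dfrac{\sum x_1 x_2}{n\sigma_x^2} + \dfrac{\sum x_2 x_1}{n\sigma_x^2} + \cdots}\;\sqrt{\dfrac{\sum y_1^2}{n\sigma_y^2} + \dfrac{\sum y_2^2}{n\sigma_y^2} + \cdots + \dfrac{\sum y_1 y_2}{n\sigma_y^2} + \dfrac{\sum y_2 y_1}{n\sigma_y^2} + \cdots}}$$
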

We have above two types of product moments: those of the
type $\sum x_1 y_1 / n\sigma_x\sigma_y$, and those of the type $\sum x_1 x_2 / n\sigma_x^2$ or $\sum y_1 y_2 / n\sigma_y^2$.
The latter represent the correlations within the series of x measures
or within the series of y measures, being the intercorrelations
among the samples within each set. The former represent the
intercorrelations between the x samples and the y samples.
Since the samples are assumed to be similar within each set, we
may represent the first type of product moments as $\bar r_{xy}$, the average
intercorrelation between the samples, or simply as $r_{xy}$ on
the assumption that these intercorrelations are reasonably well
represented by any $r_{xy}$ that we may have at hand. Those of
the second type we may represent as $r_{1\mathrm{I}x}$ and $r_{1\mathrm{I}y}$, the average
intercorrelation among the samples within each set.

Evidently there are in the numerator ab of the $r_{xy}$’s because
each element of the x series, of which there are a in number,
enters into combination with each element of the y series of
which there are b. But in the denominator each enters into
combination with one less than the whole series (itself being
excluded by reason of having been used in the $x^2$ or $y^2$ element);
therefore the number will be a(a − 1) or $(a^2 - a)$ on the left
and $(b^2 - b)$ on the right. Each expression of the type $\sum x^2 / n\sigma_x^2$
equals 1, since it is equivalent to $\sigma_x^2/\sigma_x^2$, and there are a of these
on the left and b of them on the right. Keeping in mind all
these equivalents, we may write


$$r_{(ax)(by)} = \frac{ab\,\bar r_{xy}}{\sqrt{a + (a^2 - a)\,r_{1\mathrm{I}x}}\;\sqrt{b + (b^2 - b)\,r_{1\mathrm{I}y}}} \qquad (111)$$

(Coefficient of correlation between the sum of a samples of an x function and the sum of b samples of a y function)


This is the r between sums. Since we may divide either or 
both arrays in any correlation problem by any constant or con- 
stants without changing the value of r, the r between averages 
is precisely the same as the r between sums. The formula for 
the r between the average scores in a samples of x measurements 
and b sets of corresponding y measurements, where the samples 
are similar within their own series, is thus precisely the same as 
the above for sums. But, when we quote it as the r between 
averages, we shall employ for the r the symbol raJ,. 
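Formula (111) can be put into a small function for experiment. This is only a sketch: the function and argument names are our own, and the input values in the example are hypothetical, chosen merely to show how lengthening both series raises the correlation.

```python
import math

def r_between_sums(a, b, r_xy, r_xx, r_yy):
    """Formula (111): r between the sum (or average) of a samples of x and
    the sum (or average) of b samples of y.

    r_xy is the average x-y intercorrelation; r_xx and r_yy are the average
    intercorrelations within the x series and within the y series."""
    return (a * b * r_xy) / (
        math.sqrt(a + (a**2 - a) * r_xx) * math.sqrt(b + (b**2 - b) * r_yy)
    )

# Hypothetical values: single measures correlate .40; within-series
# intercorrelations are .60 for x and .70 for y.
for a, b in [(1, 1), (3, 2), (10, 10)]:
    print(a, b, round(r_between_sums(a, b, 0.40, 0.60, 0.70), 3))
```

With a = b = 1 the function returns the single-form correlation unchanged, as it should.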


RELIABILITY OF AVERAGES 


We have two main types of applications for these formulas. 
The first is the type where x and y are the same function. This 
is the case where a number of judges make estimates on a group 
of individuals and we are concerned to know how closely the 
averages of these estimates may be expected to correlate with 
the sum of estimates made by these same judges, or by others 
similar to them, on another sample of individuals similar to the 
first sample. It is also the case where we have the average 
intercorrelation of a number of forms of a test in hand and wish 
to judge how closely the scores obtained from the sum of these 
forms in hand could be expected to correlate with the sums or 
the averages from any given number of similar forms to be 
obtained in the future. This is the problem of reliability. In 
this type of application all the forms may be taken as similar,
so that all intercorrelations are of the type $r_{1\mathrm{I}}$, whether obtained
within set a or within set b. Under these conditions formula
(111) becomes

$$r_{ab} = \frac{ab\,r_{1\mathrm{I}}}{\sqrt{a + (a^2 - a)\,r_{1\mathrm{I}}}\;\sqrt{b + (b^2 - b)\,r_{1\mathrm{I}}}} \qquad (112)$$

(Predicted correlation between the averages from a forms of a test and the averages from b forms of the same test)
If a = b, formula (112) becomes

$$r_{aa} = \frac{a\,r_{1\mathrm{I}}}{1 + (a - 1)\,r_{1\mathrm{I}}} \qquad (113)$$

(Predicted correlation between the average scores from a forms of a test and those from a other forms of the same test)

If a is 2, this formula becomes

$$r_{22} = \frac{2\,r_{1\mathrm{I}}}{1 + r_{1\mathrm{I}}} \qquad (114)$$

(Spearman-Brown formula for predicting the reliability of a test of doubled length)



Formula (114) is the one we employ when we split a test into 
two halves (as odds and evens), get the correlation between these 
two halves, and then step this up to a prediction of what the r 
could be expected to be if taken between the whole of the test 
and another whole test instead of between the halves. It is 
very widely employed in calculating the reliability coefficient
of a test. Mathematically it indicates what should be obtained 
by correlating two forms of a test, provided those forms are as 
closely similar as the two halves are. But in practice it will be 
found to give slightly higher correlations than those obtained 
by correlating two forms. This is because, in the case of split 
halves from the same test, conditions are precisely the same for
the two halves (same condition of health for a pupil on the two
halves, same degree of understanding of the instructions, same
motivation, etc.), while with different forms, given on different
days or even in sequence on the same day, the conditions may not
be the same for the two forms. These inequalities in the degree 
and manner to which changed conditions affect different pupils 
will tend to lower the correlation between forms. Sometimes a 
third method is employed in order to ascertain the reliability 
coefficient of a test: to readminister the same form of the test 
after an interval. By this method the coefficient tends to be 
raised by reason of the element of overlapping. Thus relia- 
bility coefficients are highest when the same form of the test is 
repeated at a reasonably short interval, next highest by the split- 
halves method, and lowest by the correlation of scores from differ- 
ent forms. 

Let us return to our general reliability formula [Eq. (111)] 
and carry it through one more step of development; let us make 
b infinitely large. We shall then have the predicted correlation 
between the average scores from a forms of a test, or the average 
estimates by a judges, and the averages from an infinite number.
This average from an infinite number may be called the true
scores, and the r between the a forms and the infinite number
may be called the correlation of the obtained averages with the true
averages. An examination of the formula will show that we cannot
substitute infinity for b directly because that would give
us infinity in both numerator and denominator, infinity over
infinity, which is indeterminate in value. We shall, therefore,
divide both numerator and denominator by b, remembering that
this must become $b^2$ when dividing under the radical sign. Then

$$r_{ab} = \frac{a\,r_{1\mathrm{I}}}{\sqrt{a + (a^2 - a)\,r_{1\mathrm{I}}}\;\sqrt{\dfrac{1}{b} + \left(1 - \dfrac{1}{b}\right) r_{1\mathrm{I}}}}$$


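The 1/b equals zero, since the denominator is infinity. Therefore,

$$r_{a\infty} = \frac{a\,r_{1\mathrm{I}}}{\sqrt{a + (a^2 - a)\,r_{1\mathrm{I}}}\;\sqrt{r_{1\mathrm{I}}}} = \frac{a\,r_{1\mathrm{I}}}{\sqrt{a\,r_{1\mathrm{I}} + (a^2 - a)\,r_{1\mathrm{I}}^2}} \qquad (115)$$

(Predicted correlation between the average scores from a forms of a test and the true scores)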


We shall illustrate the application of some of these formulas
from a study by one of the authors of the influence of motion
pictures on standards of morality.¹ The investigation of motion
pictures involved the necessity of giving them ratings on the
degree of divergence from the mores with the guidance of
certain scales having quantitative indices. With these scales in
hand three judges made ratings of the scenes that fell within
selected areas. We shall take as a sample the ratings on the
treatment of children by parents. The average intercorrelation
of the ratings by the three judges, computed by the method to be
described in our next section, was .862. When this is entered
into formula (113), we get, as the predicted correlation between
the averages of the ratings for the several films from these three
judges and the averages from another three of the same general
type, the following:

$$r_{33} = \frac{3(.862)}{1 + (3 - 1)(.862)} = .949$$

1 Peters, Charles C., Motion Pictures and Standards of Morality, The
Macmillan Company, 1933.

Employing formula (115) for the correlation between the
averaged ratings from the three judges and those to be expected
from an infinite number, we get

$$r_{3\infty} = \frac{3(.862)}{\sqrt{3(.862) + (3^2 - 3)(.862)^2}} = .974$$

The meaning of this last correlation is that, for most practical 
purposes where the average intercorrelation among judges is 
as high as .862, three judges are sufficient; for then the joint 
estimates agree to the extent of an r of .974 with the true estimates 
that one would obtain from an unlimited number of judges. 
This is as close agreement as we would ordinarily demand. If, 
however, we decide that we would be satisfied with an r that is 
no less than, say, .99, we could put this value into our formula 
on the left and the known average intercorrelation on the right, 
and determine the number of judges (a) required to give that 
r by solving the equation for a. 
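The figures above, and the solution for the number of judges, can be sketched as follows. The function names are ours. The solution for a rests on squaring formula (115), which gives $r_{a\infty}^2 = a\,r_{1\mathrm{I}} / [1 + (a - 1)\,r_{1\mathrm{I}}]$, and then solving for a.

```python
import math

def r_aa(a, r11):
    """Formula (113): predicted r between averages of a forms and a other forms."""
    return a * r11 / (1 + (a - 1) * r11)

def r_a_inf(a, r11):
    """Formula (115): predicted r between the average of a forms and true scores."""
    return a * r11 / math.sqrt(a * r11 + (a**2 - a) * r11**2)

def judges_needed(target, r11):
    """Solve formula (115) for a, given a target correlation with true scores.

    Squaring (115): target^2 = a*r11 / (1 + (a - 1)*r11), hence
    a = target^2 * (1 - r11) / (r11 * (1 - target^2))."""
    t2 = target**2
    return t2 * (1 - r11) / (r11 * (1 - t2))

print(round(r_aa(3, 0.862), 3))                  # 0.949
print(round(r_a_inf(3, 0.862), 3))               # 0.974
print(math.ceil(judges_needed(0.99, 0.862)))     # 8
```

On these figures, then, about eight judges of the same type would be needed to reach an r of .99 with the true estimates.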

Lengthening a test increases its validity as well as its reliability. 
If we measure validity in terms of the correlation of the test 
with an outside criterion and assume that there is one form of the 
criterion, formula (111) becomes 


$$r_{c(ax)} = \frac{a\,r_{cx}}{\sqrt{a + (a^2 - a)\,r_{1\mathrm{I}}}} \qquad (116)$$

(Predicted correlation between a criterion and the sum or average of a forms of a test)


AVERAGE INTERCORRELATION 


Average intercorrelations were called for in the formulas 
of the preceding section. A great deal of labor would be involved 
in computing these in the regular way, especially when the 
number of forms to be intercorrelated becomes considerable. 
Fortunately there is available a very simple method of computing 
an average intercorrelation if we may assume equal variabilities 
in the arrays among which the average intercorrelations are sought.

Let s be the sum of the corresponding items across all the
arrays in the case of one individual. Then, if 1, 2, 3, ..., a
number the columns, $s = x_1 + x_2 + x_3 + \cdots + x_a$. Squaring,
summing for all the individuals (N), and dividing by their
number,

$$\frac{\sum s^2}{N} = \frac{\sum (x_1 + x_2 + x_3 + \cdots + x_a)^2}{N}$$
Squaring the polynomial and putting the summation sign and
the N with each term, we have

$$\frac{\sum s^2}{N} = \frac{\sum x_1^2}{N} + \frac{\sum x_2^2}{N} + \cdots + \frac{\sum x_a^2}{N} + \frac{\sum x_1 x_2}{N} + \frac{\sum x_1 x_3}{N} + \cdots + \frac{\sum x_{a-1} x_a}{N}$$


The term on the left of the equation is the standard deviation
squared of the sums of scores by individuals. The items of
the first type on the right are expressions for the standard deviation
squared of the columns (i.e., by forms or judges). Since
we are assuming equal variabilities among our arrays, we may
call any one of them $\sigma_1^2$, meaning the standard deviation squared
of an individual array. (An array is the set of scores assigned
to the individuals by a single judge or achieved on a single form
of the test.) There are obviously a of these $\sigma_1^2$’s, a being the
number of arrays; since we have assumed them sufficiently
similar to be treated as averages, their sum is $a\sigma_1^2$. We shall
also treat the product moments as averages. In order to make
r’s out of them we must, of course, multiply and divide each by $\sigma_1\sigma_2$, so
that each will give us $r\sigma_1^2$. There are $(a^2 - a)$ of these, as in
the previous development. Using a symbol to indicate that
these are averages and summing them as such, our equation
becomes,

$$\sigma_s^2 = a\sigma_1^2 + (a^2 - a)\,\bar r_{1\mathrm{I}}\,\sigma_1^2 \qquad (117)$$

(The variance of the sum of a similar correlated arrays)

Transposing and solving for $\bar r_{1\mathrm{I}}$,

$$\bar r_{1\mathrm{I}} = \frac{\sigma_s^2 - a\sigma_1^2}{\sigma_1^2\,(a^2 - a)} = \frac{(\sigma_s^2/\sigma_1^2) - a}{a^2 - a} \qquad (118)$$

(Average intercorrelation among a arrays of equal variability)


The a is, of course, the number of arrays correlated. The $\sigma_1^2$
is best found array by array, then these $\sigma^2$’s averaged. If all
the scores are thrown together into a single frequency table for
the calculation of the $\sigma_1$, the additional assumption must be made
that the means of the arrays are equal. This is likely to be
a more disturbing factor than the assumption of equal variabilities.
For averaging the variances of the a different arrays, the
following formula is convenient. Notice that, since the N is
the number of items, it is the same for all the arrays.

$$\sigma_1^2 = \frac{N\sum\sum X^2 - \sum\left(\sum X\right)^2}{aN^2}$$
Table XIII is a table of ratings made by 16 high-school pupils
on 25 of their fellow pupils on the trait of social-mindedness.
The ratings are on a scale of 1 to 5, with each stage defined by
a description. The illustration is from the “Survey of On-coming
Youth in Pennsylvania” conducted by Harlan Updegraff.
At the foot of each column we give the “makings” of the standard
deviation of the individual ratings and in the two columns at
the right the “makings” of the standard deviation of the sums
and also of the averages. The computations are given a few
lines below. For $\sum\sum X$ we add along the columns all the separate
$\sum X$ elements, and for $\sum\sum X^2$ we similarly add the $\sum X^2$’s of the
separate columns. For the sums, column $\sum X_s$ is necessarily
the same as $\sum\sum X$, which is 1,359. For the sums, column $\sum X_s^2$ is
77,763. The student may himself wish to solve the problem
by way of averages. The computations by way of the sums
follow:

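$$\sigma_1^2 = \frac{N\sum\sum X^2 - \sum\left(\sum X\right)^2}{aN^2} = \frac{25 \cdot 5{,}265 - 117{,}597}{16 \cdot 25^2} = 1.4$$

$$\sigma_s^2 = 155.5$$

$$\bar r_{1\mathrm{I}} = \frac{(\sigma_s^2/\sigma_1^2) - a}{a^2 - a} = \frac{(155.5/1.4) - 16}{16^2 - 16} = .396$$

$$r_{aa} = \frac{a\,\bar r_{1\mathrm{I}}}{1 + (a - 1)\,\bar r_{1\mathrm{I}}} = \frac{16(.396)}{1 + 15(.396)} = .91$$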


Usually when one is dealing with such data as those to which 
this section relates, especially when in the form of ratings, he 
will have in hand the averages for the individuals rated in addi- 
tion to the sums, or perhaps only the averages. Formula (118) 
may then take the shape, 

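$$\bar r_{1\mathrm{I}} = \frac{(a^2\sigma_{av}^2/\sigma_1^2) - a}{a^2 - a} = \frac{(a\,\sigma_{av}^2/\sigma_1^2) - 1}{a - 1} \qquad (119)$$

in which $\sigma_{av}$ is the standard deviation of the averages.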


In this form it is possible to estimate roughly the reliability 
of the ratings from the range of the averages and an estimate 
of the variability of the individual ratings. In the survey from 
which the above illustration was taken, it was necessary to 
detect, and to hold out for further investigation, those rooms in 
which the ratings seemed to have been unreliably made. Because 
there were hundreds of rooms, actual calculation would have 
been far too slow. We observed that the individual ratings had, 
usually, a standard deviation of about 1, and we took the standard
deviation of the averages to be about one-fourth of the range.
We, therefore, looked over the averages from the 30 or more
pupils in a room, recorded the highest and the lowest of these,
divided their difference by four, and substituted in formula (119), or mentally
made the substitution. Suppose, for example, the averages in a
particular room ranged from 1.22 to 3.18, and there were 30
judges contributing to the rating. We would have, as a rough
estimate of the intercorrelation,

$$\bar r_{1\mathrm{I}} = \frac{30(.49)^2 - 1}{30 - 1} = .21$$

And the reliability of the average of the 30 ratings would be

$$r_{aa} = \frac{a\,\bar r_{1\mathrm{I}}}{1 + (a - 1)\,\bar r_{1\mathrm{I}}} = \frac{(30)(.21)}{1 + (29)(.21)} = .89$$
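The rough screening procedure just described can be sketched as a single function. It assumes, as in the text, that the standard deviation of the averages is about one-fourth of their range and that the individual ratings have a standard deviation of about 1; the function name and the default argument are our own.

```python
def rough_reliability(low, high, a, sd_single=1.0):
    """Rough estimate from the range of the averages.

    Returns (average intercorrelation by formula (119),
             reliability of the average of a ratings by formula (113))."""
    sd_avg = (high - low) / 4                      # range / 4 as a stand-in for sigma
    r_bar = (a * sd_avg**2 / sd_single**2 - 1) / (a - 1)
    return r_bar, a * r_bar / (1 + (a - 1) * r_bar)

r_bar, rel = rough_reliability(1.22, 3.18, 30)
print(round(r_bar, 2), round(rel, 2))  # 0.21 0.89
```

A room whose averages span a narrow range gives a low or even negative estimate, which is the signal for holding it out for further investigation.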


If one prefers, he may use the split-halves method for deter- 
mining reliability in such situations as we have been referring 
to in the above paragraphs. That is, one may break the set of 
raters into two approximately equal chance halves, get an array
of sums or of averages from each of these halves, determine the
coefficient of correlation between these halves, and apply formula
(114). If the assumptions mentioned in this section have
been fully met, the outcome will be identical by way of the split-halves
method with that obtained by way of the average-inter-
correlation method, as the reader may wish to convince himself 
by a little exercise in algebraic manipulation. 

Average Intercorrelation from Ranks.—If our data are in the 
form of ranks, we can put formula (118) in a different shape, 
simpler for operative purposes, by taking advantage of the fact 
that the standard deviation of any set of ranks is known. We 
treated that matter on page 107. Since the $\sigma_1$ is the standard
deviation of a set of N ranks, $\sigma_1^2 = (N^2 - 1)/12$. We need yet a
more convenient value for $\sigma_s^2$. The average of a set of N ranks is
the sum of the first and the last divided by 2; that is, $(N + 1)/2$.
Each array has this as its average rank score. Therefore all
the a arrays together have as the sum of all their scores $N \cdot a$
times this average. That is,

$$\sum S = Na\left(\frac{N + 1}{2}\right) \quad\text{and}\quad \frac{\sum S}{N} = a\left(\frac{N + 1}{2}\right)$$

Remembering the general formula for a standard deviation in
terms of scores, $\sigma^2 = (\sum X^2/N) - (\sum X/N)^2$, or $(\sum X^2/N) - M^2$,
we have in the case of our data, $\sigma_s^2 = \dfrac{\sum S^2}{N} - \dfrac{a^2(N + 1)^2}{4}$. The a
has the same meaning and value as before. Substituting in
formula (118) the equivalents just found for the two types of
σ’s, we have

$$\bar r_{1\mathrm{I}} = \frac{\dfrac{\sum S^2}{N} - \dfrac{a^2(N + 1)^2}{4} - \dfrac{a(N^2 - 1)}{12}}{\dfrac{(N^2 - 1)}{12}\,(a^2 - a)}$$


Rearranging the order of terms and placing all the terms of the 
numerator over 12 as a common denominator, we have 


$$\bar r_{1\mathrm{I}} = \frac{-\dfrac{a(N^2 - 1)}{12} - \dfrac{3a^2(N + 1)^2}{12} + \dfrac{12\sum S^2}{12N}}{\dfrac{(N^2 - 1)}{12}\,(a^2 - a)}$$


Canceling certain terms and breaking the fraction into three 
component ones, we get 
$$\bar r_{1\mathrm{I}} = \frac{1}{1 - a} - \frac{3a(N + 1)}{(N - 1)(a - 1)} + \frac{12\sum S^2}{aN(N^2 - 1)(a - 1)}$$

By some rearrangement of terms this can be put into the form
given by Kelley as follows:

$$\bar r_{1\mathrm{I}} = 1 - \frac{a(4N + 2)}{(a - 1)(N - 1)} + \frac{12\sum S^2}{a(a - 1)(N^2 - 1)N} \qquad (120)$$

(Average intercorrelation among arrays expressed in ranks)

Although this formula is somewhat long, all the terms in it have
a conventional meaning, and it is easily applied in practice. It
involves no special assumptions.
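Formula (120) can be checked against the laborious method of averaging every pairwise rank correlation. The sketch below uses a small set of hypothetical rankings by three judges of five individuals, with no tied ranks; the function names are our own, and the pairwise coefficient is the familiar rank-difference formula.

```python
from itertools import combinations

def avg_intercorr_ranks(rank_arrays):
    """Formula (120): average intercorrelation among a arrays of N untied ranks."""
    a, N = len(rank_arrays), len(rank_arrays[0])
    # S is the sum of the ranks received by one individual across all arrays.
    sum_S2 = sum(sum(col[i] for col in rank_arrays) ** 2 for i in range(N))
    return (1 - a * (4 * N + 2) / ((a - 1) * (N - 1))
            + 12 * sum_S2 / (a * (a - 1) * (N**2 - 1) * N))

def rank_difference_rho(u, v):
    """Rank-difference correlation between two sets of untied ranks."""
    N = len(u)
    d2 = sum((x - y) ** 2 for x, y in zip(u, v))
    return 1 - 6 * d2 / (N * (N**2 - 1))

# Hypothetical rankings of five individuals by three judges.
judges = [[1, 2, 3, 4, 5], [2, 1, 3, 5, 4], [1, 3, 2, 4, 5]]
pairs = list(combinations(judges, 2))
direct = sum(rank_difference_rho(u, v) for u, v in pairs) / len(pairs)

print(round(avg_intercorr_ranks(judges), 4), round(direct, 4))  # 0.7667 0.7667
```

The two routes agree exactly here because every set of N untied ranks has the same mean and the same standard deviation, so the assumption of equal variabilities is met automatically.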


INTRACLASS CORRELATION 
In formulas (118) and (119) for average intercorrelation the
columns were treated as possibly somewhat different from one
another. Consequently, we took their contributions to $\sigma_1^2$ from
deviations from their own several means. But suppose we could
be sure that these columns were alike, except for chance fluctuation.
Suppose, for example, we had a number of trees represented
by the rows and measurements of samples of the leaves
from each spread through the rows. That would place the
different trees in place of the 25 pupils and the leaves in place
of the 16 raters. Then, since the leaves would not have been
assigned to columns in any systematic manner, the columns would
tend all to have the same means and the same variabilities. It
would then make no appreciable difference whether we computed
the σ’s of the columns from their own means or from the grand
mean. The same thing would be true if we had, say, measurements
of the texture of the teeth of brothers spread through the
rows and the families to which they belonged represented by
the columns. In this situation we could use formula (119) with
$\sigma_1$ computed from deviations from the grand mean. Such an
average r between sets of members of which the pairs belong to 
the same class is called an intraclass correlation. It is evidently a 
special form of average intercorrelation in which sufficient con- 
fidence can be put in the assumption that the classes are alike 
(apart from chance fluctuation) that the deviations may be 
taken from the mean of the combined classes rather than from 
the means of the several classes. On this assumption Harris’s 
formula for intraclass correlation reduces to our formula (119). 
Of course, intraclass correlations can be computed directly from 
a correlation table by entering each pair twice, entering a mem- 
ber’s score on the y axis at one entry and on the x axis at the 
second entry, then computing the r from the combined table. 
The outcome is the same as if formula (119) were used, but 
the process becomes too laborious beyond several pairs of classes. 
The worker must take warning that the technique of calculating 
intraclass correlations is highly sensitive to the fulfillment of its 
assumptions; if the classes are not really alike, so that the 
assumption of equal means is not fulfilled, highly distorted results 
may follow.²
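
The double-entry procedure described above can be sketched in a few lines. This is only an illustration under our own assumptions: the function names `pearson_r` and `intraclass_r` and the small data set are hypothetical, and every ordered pair of members within a class is entered, which for classes of two reduces to entering each pair twice.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment r from paired lists, using deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def intraclass_r(classes):
    """Double-entry intraclass r: each ordered pair within a class is one entry."""
    xs, ys = [], []
    for members in classes:
        for first, second in permutations(members, 2):
            xs.append(first)   # score entered on the x axis
            ys.append(second)  # same pair's other score on the y axis
    return pearson_r(xs, ys)


# Hypothetical data: three families with two brothers each.
families = [[1, 2], [3, 4], [5, 6]]
print(round(intraclass_r(families), 4))  # 0.8286
```

Because both columns of the combined table contain exactly the same scores, the deviations are automatically taken from the grand mean, which is the assumption on which the intraclass technique rests.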


¹ Quoted by Fisher, Statistical Methods for Research Workers, 7th ed.,
p. 220.

² Ibid., p. 235, for an example in which the intraclass correlation technique
leads to the inference that the r is negative when observation of the table,
or calculation of the average intercorrelation among the columns or among
the rows, shows that the average intercorrelation of the classes is markedly
positive.


CORRECTING A COEFFICIENT OF CORRELATION
FOR ATTENUATION


Virtually always in practice when we compute a coefficient of 
correlation between two arrays, we are correlating measures of 
imperfect reliability. To the extent to which our measures are 
unreliable, our obtained correlation is too low; if there were no 
reliability at all in our measures of one or both of the traits, we 
would get a zero correlation no matter how highly the traits 
were really correlated intrinsically, i.e., if truly measured. We 
have a legitimate interest, therefore, in asking ourselves how much 
higher the true correlation is than the one we have obtained from 
our fallible measures. To find this correction is to “correct 
our r for attenuation.” A formula for this purpose is easily 
derived after the manner employed in the second section of this 
chapter. Since we want the “true” correlation, we must have 
the correlation of the average scores from an all-but-infinite 
number of forms of each of the tests with which we are dealing. 
Making our starting point formula (111) in the first section of 
this chapter, we must therefore make both a and b infinite. As 
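a preliminary step to doing this, we shall divide both numerator
and denominator of the fraction by ab, and get

$$r_{\bar{x}_a \bar{y}_b} = \frac{\bar{r}_{xy}}{\sqrt{\frac{1}{a} + \left(1 - \frac{1}{a}\right) r_{1I_x}}\,\sqrt{\frac{1}{b} + \left(1 - \frac{1}{b}\right) r_{1I_y}}}$$

As a and b approach infinity, we shall have left

$$r_{x_\infty y_\infty} = \frac{\bar{r}_{xy}}{\sqrt{r_{1I_x}\, r_{1I_y}}}$$
(Formula for a coefficient of correlation corrected for attenuation) (121)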


The $r_{1I_x}$ and the $r_{1I_y}$ are the reliability coefficients of the two
tests, respectively. The $\bar{r}_{xy}$ is the average correlation between
any number of samples of measures of the x and the y functions.
But in practice we ordinarily have in hand only one pair of samples,
and we must take this as representing the r between the
functions obtained with fallible measures. Therefore the r
corrected for attenuation is merely the obtained r divided by
the square root of the product of the reliability coefficients of the
measures. This formula is applicable, of course, not only to
tests of the objective, verbal type but to any sort of estimates or
other fallible measures. Sometimes this correction results in
an r greater than 1.00. This is because the $\bar{r}_{xy}$ obtained from our
particular sample is not the one we would have obtained from the
average of a considerable number of samples but is an abnormally
high variant. It is the practice to write no corrected r’s as
higher than 1.00 even though the correction gives mathematically
a higher one.
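
The correction can be sketched as follows. The function name `correct_for_attenuation`, its `cap` argument, and the numerical values are our own illustrative assumptions; the arithmetic is that of formula (121), with the practice of writing no corrected r above 1.00 applied by default.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y, cap=True):
    """Formula (121): obtained r over the root of the product of reliabilities."""
    if rel_x <= 0 or rel_y <= 0:
        raise ValueError("reliability coefficients must be positive")
    corrected = r_xy / sqrt(rel_x * rel_y)
    # Following the practice noted in the text, report nothing above 1.00.
    return min(corrected, 1.0) if cap else corrected


# Hypothetical values: obtained r of .60, reliabilities of .80 and .90.
print(round(correct_for_attenuation(0.60, 0.80, 0.90), 3))  # 0.707
```

Passing `cap=False` returns the raw quotient, which is useful when one wishes to see how far an abnormally high sample r has carried the correction beyond 1.00.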


THE RELATION BETWEEN TRUE AND FALLIBLE SCORES 


This section merely extends the technique employed in the 
preceding one. For certain theoretical purposes we may wish 
to know certain relations between true and fallible scores (the
latter being defined as the scores obtained from a relatively
short instrument of measurement, so that it yields results that
vary somewhat from sample to sample, and the former being defined
as the scores yielded by an instrument applied an infinite number
of times and the average taken, so that the scores have been
completely stabilized). We shall consider first the r between
scores on a single sample of x measures and an infinite number
of measurements of a y function. This will give us the r between
fallible scores and a true criterion. Making the general formula
[Eq. (111)] our starting point, b is to be infinite and a is to equal
1. Dividing numerator and denominator by b, we have


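$$r_{\bar{x}_a \bar{y}_b} = \frac{a\,\bar{r}_{xy}}{\sqrt{a + (a^2 - a)\, r_{1I_x}}\,\sqrt{\frac{1}{b} + \left(1 - \frac{1}{b}\right) r_{1I_y}}}$$

Substituting 1 for a and infinity for b, we have

$$r_{x y_\infty} = \frac{\bar{r}_{xy}}{\sqrt{1 + (1 - 1)\, r_{1I_x}}\,\sqrt{0 + (1 - 0)\, r_{1I_y}}} = \frac{\bar{r}_{xy}}{\sqrt{r_{1I_y}}}$$
(Correlation between a fallible score and a true criterion) (122)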


The $r_{1I_y}$ is the reliability coefficient of the instrument that is to
constitute the infallible criterion, computed from fallible samples
of it which we have in hand. The $\bar{r}_{xy}$ is the obtained correlation
between fallible measures of the two correlated functions. In
order to get the r between true scores and a fallible criterion, we
would need only to make the proper interchanges in the formula.

We shall next take the case of correlation between a single
sample and the average of an infinite number of samples of the
same function. Here we proceed in the same manner as in
the preceding paragraph; but the $\bar{r}_{xy}$, the $r_{1I_x}$, and the $r_{1I_y}$ are
the same sort of correlations (since the x series and the y series
are similar measures of the same function), so that all the r’s
of the formula will be $r_{1I}$. Then


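$$r_{\bar{x}_a \bar{x}_b} = \frac{a\, r_{1I}}{\sqrt{a + (a^2 - a)\, r_{1I}}\,\sqrt{\frac{1}{b} + \left(1 - \frac{1}{b}\right) r_{1I}}}$$

Substituting infinity for b and 1 for a, we have

$$r_{1\infty} = \frac{r_{1I}}{\sqrt{1}\,\sqrt{r_{1I}}} = \frac{r_{1I}}{\sqrt{r_{1I}}}$$

Reducing this by dividing the numerator by the denominator,
we have

$$r_{1\infty} = \sqrt{r_{1I}}$$
(Index of reliability: the correlation between fallible and true scores of
the same function) (123)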


Thus the correlation between a set of fallible scores and a set 
of true scores of the same function is the square root of the average 
intercorrelation among the samples, i.e., the square root of the 
reliability coefficient of the test. This r between obtained scores 
and hypothetical true scores of a function is called the index 
of reliability in contrast with the coefficient of reliability of the 
test. It is claimed by many persons that such index of reliability 
is a fairer formula in terms of which to state the reliability of a 
measure than is the coefficient of reliability. 
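
The two relations just derived, formulas (122) and (123), amount to very little arithmetic, as the following sketch suggests. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def r_fallible_with_true_criterion(r_xy, rel_y):
    """Formula (122): obtained r divided by the root of the criterion's reliability."""
    return r_xy / sqrt(rel_y)


def index_of_reliability(rel):
    """Formula (123): r between fallible and true scores of the same function."""
    return sqrt(rel)


# Hypothetical values: reliability coefficient .81, obtained r of .45.
print(round(index_of_reliability(0.81), 2))                  # 0.9
print(round(r_fallible_with_true_criterion(0.45, 0.81), 2))  # 0.5
```

Since a reliability coefficient is a fraction below 1, its square root is always the larger figure, which is why the index of reliability runs higher than the coefficient of reliability for the same test.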

Can we estimate an individual’s true score from the fallible
score we have in hand for him? We shall see. We have already
learned how to estimate a score in a second series from a score in a
first series, knowing the σ’s of the two and the coefficient of
correlation between them. We make our estimate by means of the
regression equation in score form, for which the formula was
given on page 111. It is $\bar{Y} = r_{xy}(\sigma_y/\sigma_x)(X - M_x) + M_y$.
For us the $\bar{Y}$ is to be the true score, $\sigma_y$ is to be $\sigma_\infty$, the standard
deviation of a set of true scores, $\sigma_x$ is to be $\sigma_1$, the standard deviation
of the sample we have in hand, and $r_{xy}$ is to be $r_{1\infty}$. Our formula
then becomes

$$\bar{X}_\infty = r_{1\infty}\,\frac{\sigma_\infty}{\sigma_1}\,(X - M_x) + M_\infty$$


So far as we know $M_\infty$ is the same as $M_x$. We, therefore, know
everything required by our regression equation except $\sigma_\infty$.
We shall now find that. Taking the square root of both sides
of formula (117) we get

$$\sigma_{s_a} = \sigma_1 \sqrt{a + (a^2 - a)\, r_{1I}}$$


This is the standard deviation of the sum of a sets of samples. 
We may get the standard deviation of the average of a samples 
by dividing the sum by a, remembering that we must make this 
a? when dividing under the radical sign. 


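$$\sigma_{\bar{x}_a} = \sigma_1 \sqrt{\frac{1}{a} + \left(1 - \frac{1}{a}\right) r_{1I}}$$
(Standard deviation of the average of a correlated measurements of the same
function) (124)

Now let a approach infinity. We shall then have

$$\sigma_\infty = \sigma_1 \sqrt{0 + (1 - 0)\, r_{1I}}$$

$$\sigma_\infty = \sigma_1 \sqrt{r_{1I}}$$
(Standard deviation of a set of true scores) (124a)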


We are now ready to substitute in our regression equation.

$$\bar{X}_\infty = \sqrt{r_{1I}}\,\frac{\sigma_1 \sqrt{r_{1I}}}{\sigma_1}\,(X - M_x) + M_x = r_{1I}(X - M_x) + M_x$$

By a rearrangement of terms this may be written in a form that
is more convenient for operative purposes as follows:

$$\bar{X}_\infty = r_{1I} X + (1 - r_{1I}) M_x$$
(Formula for a true score in terms of a fallible score and the reliability
coefficient of the test) (125)

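
Formulas (124), (124a), and (125) may be tried out with a short sketch such as the following. The function names and the pupil's figures are hypothetical; the code merely restates the three formulas.

```python
from math import sqrt


def sd_of_average(sd_1, rel, a):
    """Formula (124): SD of the average of a correlated measurements."""
    return sd_1 * sqrt(1 / a + (1 - 1 / a) * rel)


def sd_of_true_scores(sd_1, rel):
    """Formula (124a): the limit of (124) as a approaches infinity."""
    return sd_1 * sqrt(rel)


def estimated_true_score(score, mean, rel):
    """Formula (125): weight the score by r and the mean by (1 - r)."""
    return rel * score + (1 - rel) * mean


# Hypothetical pupil: fallible score 120, group mean 100, reliability .90.
print(round(estimated_true_score(120, 100, 0.90), 2))  # 118.0
```

The estimate always lies between the obtained score and the mean of the group, and it moves toward the mean as the reliability coefficient falls.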
What is the standard error involved in estimating a true score 
from a fallible criterion? We can easily determine. Our 
general formula for the standard error of estimate is 


$$\sigma_{\mathrm{est}\,y} = \sigma_y \sqrt{1 - r_{xy}^2}$$


We wish now to have y become a true measure instead of a fallible 
one. There are two cases: the first is the one in which we 
want the standard error of estimate of a true score from a 
fallible score of the same function. Here y is to be replaced by
$X_\infty$, so that we shall have

$$\sigma_{\mathrm{est}\,\infty} = \sigma_\infty \sqrt{1 - r_{1\infty}^2}$$

From our preceding developments we know that $\sigma_\infty = \sigma_1 \sqrt{r_{1I}}$,
that $r_{1\infty} = \sqrt{r_{1I}}$, and that, therefore, $r_{1\infty}^2 = r_{1I}$. Making these
substitutions, we get

$$\sigma_{\mathrm{est}\,\infty} = \sigma_1 \sqrt{r_{1I}}\,\sqrt{1 - r_{1I}}$$

$$\sigma_{\mathrm{est}\,\infty} = \sigma_1 \sqrt{r_{1I} - r_{1I}^2}$$
(Standard error of estimate of a true score from a fallible score of the
same function) (126)


We can make the converse approach and get the scatter of 
the fallible scores in hand around the true scores. This is called 
the standard error of measurement and is used considerably in 
interpreting reliability coefficients. Here x represents the true 
scores and y the fallible ones, and we have 


$$\sigma_{\mathrm{meas}} = \sigma_y \sqrt{1 - r_{\infty 1}^2} = \sigma_1 \sqrt{1 - r_{1I}}$$
(Standard error of measurement) (127)

P.E. is .6745σ. Therefore, employing the expression P.E.(meas.)
for the probable error of a fallible score when estimated from a
true criterion,

$$\mathrm{P.E.}_{\mathrm{meas}} = .6745\,\sigma_1 \sqrt{1 - r_{1I}}$$

where, again, $r_{1I}$ is the reliability coefficient of the test and $\sigma_1$ is
the standard deviation of the sample in hand.

The other case is where we wish to obtain the standard error 
of prediction of a true score in one function from a fallible score 
in another. The reader will be able to see, by proper substitu- 
tions in our basic formulas, that, when the x series and the y series 
are different, 


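$$\sigma_{\mathrm{est}\,y_\infty} = \sigma_y \sqrt{r_{1I_y}}\,\sqrt{1 - \frac{r_{xy}^2}{r_{1I_y}}}$$

$$\sigma_{\mathrm{est}\,y_\infty} = \sigma_y \sqrt{r_{1I_y} - r_{xy}^2}$$
(Standard error of estimate of a true score in one function from a fallible
score in another function) (128)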


Here the $r_{1I_y}$ is the reliability coefficient of the y measure taken
from the samples of it we have in hand, the $\sigma_y$ is the standard
deviation of the sample set of fallible measures, and the $r_{xy}$ is
the coefficient of correlation between fallible measures of the
x function and fallible measures of the y function which we have
in hand from our pair of samples.

The reader should take warning that the true scores about
which we have been talking in these sections, the r’s between
true scores, and the σ’s of true scores are to be taken as only
hypothetical entities, useful for theoretical purposes. One is 
not justified in taking the theoretical true score calculated for a 
pupil to be necessarily his correct score. It is only that the aver- 
age of the true scores made by individuals who earned the same 
score on a fallible measure as that attained by him would be the 
true score estimated for him. His own might diverge widely 
from the estimated one. Neither should we substitute true 
sigmas for obtained ones in practical computations or use the 
“r corrected for attenuation” in applied regression equations or 
in the multiple or partial correlation problems we shall treat in 
our next chapter. 
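
With that caution in mind, the error formulas (126), (127), and (128) of this section may be sketched as below, purely for theoretical exploration. The function names and the figures are our own hypothetical choices.

```python
from math import sqrt


def se_estimate_true_from_fallible(sd_1, rel):
    """Formula (126): error in estimating a true score, same function."""
    return sd_1 * sqrt(rel - rel ** 2)


def se_measurement(sd_1, rel):
    """Formula (127): scatter of fallible scores around true scores."""
    return sd_1 * sqrt(1 - rel)


def probable_error_of_measurement(sd_1, rel):
    """P.E. is .6745 times the standard error of measurement."""
    return 0.6745 * se_measurement(sd_1, rel)


def se_estimate_true_other_function(sd_y, rel_y, r_xy):
    """Formula (128): true score in y estimated from a fallible score in x."""
    return sd_y * sqrt(rel_y - r_xy ** 2)


# Hypothetical test: standard deviation 10, reliability coefficient .91.
print(round(se_measurement(10, 0.91), 2))                         # 3.0
print(round(se_estimate_true_from_fallible(10, 0.91), 2))         # 2.86
print(round(se_estimate_true_other_function(10, 0.91, 0.60), 2))  # 7.42
```

Formula (126) is always the smaller of the first two figures, being the standard error of measurement multiplied by the square root of the reliability coefficient.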


CORRECTING A COEFFICIENT OF CORRELATION 
FOR HETEROGENEITY 

It is well known that, other things being equal, the size of a 
coefficient of correlation is very much affected by the hetero- 
geneity of the population on which it is computed. Suppose we 
were to select 25 representative persons ranging from one year of 
age to twenty-four years and to compute a coefficient of correla- 
tion between their ages and weights. The r would be very high, 
perhaps .90 or .95. Suppose, now, we select twenty-five repre- 
sentative persons of ages approximately thirteen to fourteen 
years—say a random sample from the pupils of the eighth grade 
in school—and compute a coefficient of correlation between the 
ages and weights of this more homogeneous population. The 
r would be very low. The same contrast holds in other types of 
data. When the coefficient of reliability of a test is given, it is 
important to know through what range of talent the test was 
given from which the r was computed. The same is true of a 
coefficient of correlation between intelligence test scores and 
scores on an academic achievement test, or an r between measures 
of any other two functions. To be comparable, two r’s must 
have been computed from populations of the same degree of 
heterogeneity or there must be some method of correcting one 
of the r’s so as to indicate what it would probably be if computed
from the same type of population as that from which the one was
derived with which it is being compared.



Kelley¹ has developed a formula for correcting an r for a range
of talent different from that of another with which it is to be compared;
in other words, for inferring what an r would be in one
range of talent knowing its size in another range of talent. We 
shall employ a somewhat different approach from his but arrive 
at the same result. We take first the case of reliability. 

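Let x be a score (in deviation form) in the narrow range and
$x_\infty$ the corresponding true score. Then

$$x - x_\infty = d; \qquad x^2 - 2 x x_\infty + x_\infty^2 = d^2$$

$$\frac{\sum x^2}{N} - \frac{2\sum x x_\infty}{N} + \frac{\sum x_\infty^2}{N} = \frac{\sum d^2}{N}; \qquad \sigma_x^2 - 2 r_{x x_\infty} \sigma_x \sigma_{x_\infty} + \sigma_{x_\infty}^2 = \sigma_d^2$$

Similarly, let X and $X_\infty$ be paired deviation scores in the wide
range. Then, by the same process, if Σ represent the standard
deviation in the wide range and R the coefficient of correlation,

$$\Sigma_X^2 - 2 R_{X X_\infty} \Sigma_X \Sigma_{X_\infty} + \Sigma_{X_\infty}^2 = \Sigma_D^2$$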


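But, if the test is equally as effective in the narrow range as it
is in the wide one, the distribution of differences between fallible
and corresponding true scores will be the same in the two ranges,²
so that $\sigma_d^2$ will equal $\Sigma_D^2$. Therefore,

$$\sigma_x^2 - 2 r_{x x_\infty} \sigma_x \sigma_{x_\infty} + \sigma_{x_\infty}^2 = \Sigma_X^2 - 2 R_{X X_\infty} \Sigma_X \Sigma_{X_\infty} + \Sigma_{X_\infty}^2$$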


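Substituting the values of $r_{x x_\infty}$ and $\sigma_{x_\infty}$ from formulas (123)
and (124a), we have

$$\sigma_x^2 - 2\sqrt{r_{1I}}\,\sigma_x \sigma_x \sqrt{r_{1I}} + \sigma_x^2 r_{1I} = \Sigma_X^2 - 2\sqrt{R_{1I}}\,\Sigma_X \Sigma_X \sqrt{R_{1I}} + \Sigma_X^2 R_{1I}$$

$$\sigma_x^2 - 2 r_{1I} \sigma_x^2 + \sigma_x^2 r_{1I} = \Sigma_X^2 - 2 R_{1I} \Sigma_X^2 + \Sigma_X^2 R_{1I}$$

$$\sigma_x^2 (1 - r_{1I}) = \Sigma_X^2 (1 - R_{1I}); \qquad \frac{\sigma_x^2}{\Sigma_X^2} = \frac{1 - R_{1I}}{1 - r_{1I}}$$

$$\frac{\sigma_x}{\Sigma_X} = \frac{\sqrt{1 - R_{1I}}}{\sqrt{1 - r_{1I}}}$$
(Formula for correcting a reliability coefficient for heterogeneity) (129)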


¹ Kelley, T. L., Statistical Method, The Macmillan Company, 1923, pp.
221-223.

² This is identical with Kelley’s assumption that the standard error of
measurement is the same in both ranges. For the differences locate the
items in the columns of the correlation surface, and, if the items that
constitute the columns of the two surfaces are similarly placed, the variabilities
of the columns under the two conditions will be the same.

From this formula it is easy to calculate either coefficient, knowing
the other and knowing the standard deviations in both the
wide and the narrow range of talent.
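
Solving formula (129) for the unknown coefficient may be sketched as follows. The function name `reliability_in_other_range` and the figures are hypothetical; the function works in either direction, from the narrow range to the wide one or the reverse, since the formula is symmetrical in the two ranges.

```python
def reliability_in_other_range(rel_known, sd_known, sd_target):
    """Formula (129) solved for the reliability in the target range.

    Rests on sd_known**2 * (1 - rel_known) == sd_target**2 * (1 - rel_target),
    i.e., on equal standard errors of measurement in the two ranges.
    """
    return 1 - (sd_known ** 2) * (1 - rel_known) / (sd_target ** 2)


# Hypothetical values: reliability .60 where the SD is 10;
# what should it be in a wider range where the SD is 20?
print(round(reliability_in_other_range(0.60, 10, 20), 3))  # 0.9
```

Doubling the standard deviation divides the quantity (1 - r) by four, so that a modest reliability in a narrow range becomes a high one in a wide range.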

This formula relates only to the case of reliability. The 
development will not work through in nearly so simple a manner 
in the case of inter-function correlation. In the latter case, 
assuming that the scatter (“variance”) of the distribution of 
true scores in the one function from their corresponding true 
scores in the other function is the same in the narrow range as 
it is in the broad one, we would have the following rather com- 
plicated formulas: 


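$$\frac{\sigma_x}{\Sigma_X} = \frac{\sqrt{R_{1I_x} - (R_{xy}^2 / R_{1I_y})}}{\sqrt{r_{1I_x} - (r_{xy}^2 / r_{1I_y})}}; \qquad \frac{\sigma_y}{\Sigma_Y} = \frac{\sqrt{R_{1I_y} - (R_{xy}^2 / R_{1I_x})}}{\sqrt{r_{1I_y} - (r_{xy}^2 / r_{1I_x})}}$$
(Formula for correcting inter-function r’s for heterogeneity) (130)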


But these formulas involve the reliability coefficients of the 
measurements in both functions for both ranges. Often, if not 
usually, information regarding these will not be available. But 
we can do well enough with the much simpler formula developed 
below, which works with obtained scores rather than with true 
scores. 

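Let us assume that the standard errors of estimate are the
same in the narrow range as in the wide one (see pages 112 to
113). That is,

$$\sigma_y \sqrt{1 - r_{xy}^2} = \Sigma_Y \sqrt{1 - R_{xy}^2}$$

Dividing through this equation by $\Sigma_Y \sqrt{1 - r_{xy}^2}$, we get

$$\frac{\sigma_y}{\Sigma_Y} = \frac{\sqrt{1 - R_{xy}^2}}{\sqrt{1 - r_{xy}^2}}; \qquad \text{similarly} \qquad \frac{\sigma_x}{\Sigma_X} = \frac{\sqrt{1 - R_{xy}^2}}{\sqrt{1 - r_{xy}^2}}$$
(Approximate formula for correcting inter-function r’s for heterogeneity) (131)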


These formulas demand no information regarding the reliabilities,
and we shall shortly show that they work well enough
in practice.
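
Formula (131), solved for the unknown r, may be sketched thus. The function name is hypothetical, and two details are assumptions of ours rather than parts of the formula: the inferred r is given the same sign as the known one, and a request that would make the target standard deviation smaller than the standard error of estimate is refused.

```python
from math import sqrt


def r_in_other_range(r_known, sd_known, sd_target):
    """Formula (131) solved for the r in the target range.

    Rests on sd_known * sqrt(1 - r_known**2) == sd_target * sqrt(1 - r_target**2).
    """
    remaining = (sd_known ** 2) * (1 - r_known ** 2) / (sd_target ** 2)
    if remaining > 1:
        # The target SD would be less than the standard error of estimate.
        raise ValueError("target SD is too small for the given r and SD")
    magnitude = sqrt(1 - remaining)
    # Assumption: the sign of the known r is carried over.
    return magnitude if r_known >= 0 else -magnitude


# Hypothetical values: r of .60 where the SD is 10; SD of 20 in the wide range.
print(round(r_in_other_range(0.60, 10, 20), 3))  # 0.917
```

The refusal in the function reflects the fact that a whole distribution cannot be less variable than the scatter about its own regression line.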

Kelley’s formula for correcting a coefficient of correlation for
heterogeneity has been severely challenged. Holzinger¹ gives
an illustration in which an r of .01 is increased to an R of .75 by
doubling the size of the standard deviation and expresses doubt
whether such an extreme change from merely doubling the variability
could be reasonably expected. Odell (though not in criticism)
gives an illustration in which he infers a negative correlation
for a narrow range from a positive one for a wider range of talent.
But both of these examples are hypothetical cases, and neither is
a case likely to be ever encountered in practice. There are,
in fact, limits to the extent to which variabilities may change by
merely extending the range from which scores are drawn, which
limitation neither Holzinger nor Odell seems to recognize. The
variability of the whole y distribution cannot become less than
that of a single column of the correlation chart (assuming homoscedasticity),
for the y distribution must be made up of scores
summed across the columns. The standard deviation of a column
is $\sigma_y \sqrt{1 - r_{xy}^2}$. If the R of the heterogeneous population is
below .87, it is impossible for the σ of the homogeneous population
to be as small as half the large sigma. The absurdities involved
in certain hypothetical cases will be found to turn upon ignoring
this limitation.

¹ Holzinger, Karl J., Statistical Methods for Students in Education, Ginn
and Company, 1928, p. 254.

However, in view of the challenge to Kelley’s formula, we
subjected it to empirical test. R. S. Hovis¹ secured evidence
which indicates high validity for both formulas (129) and (131).
One type of data he used was measurements of the relation of
height and weight in children, based upon tables published in
Biometrika from some Glasgow surveys. Ten different populations
were employed, each with ranges from four to eight years
when massed into heterogeneous populations and with a range
of a single year for the homogeneous populations. In the narrow
ranges the populations ranged in number from 255 to 1,445 and
averaged about 800. Since measures of height and weight have
nearly perfect reliability, formula (130) would reduce to (131).
In 34 trials R’s predicted by formula (131) missed the corresponding
ones actually computed from the consolidated tables by
an average of only .0189 when the algebraic signs of the deviations
were disregarded and by only +.0048 when the signs were
considered. But formula (129) gave just as good a prediction;
here the average error was .0172 when signs were disregarded
and .0014 when signs were considered.

¹ Hovis, R. S., “An Evaluation and Comparison of Two Formulae for
Correcting Coefficients of Correlation for Heterogeneity,” master’s thesis,
Pennsylvania State College, 1935.

In a second study Hovis employed correlations between parts
I and II of the Otis Classification Test (general intelligence and 
academic achievement), the homogeneous populations having a 
single grade range and the heterogeneous one a six-grade range. 
In these measurements the reliabilities were lower but still good. 
Out of 40 trials with formula (131) the predicted R missed the 
computed one by an average of .0129 when algebraic signs were 
disregarded and by .0097 when signs were considered. Formula 
(129) gave average errors of .0115 and .0065, respectively. 
Thus in both these studies these formulas for correcting r’s for 
heterogeneity proved highly valid, and formula (129) was as good 
as formula (131), even though formula (129) does not theoretically 
apply to inter-function correlation. 


REMOVING THE SPURIOUS ELEMENT IN CORRELATION DUE 
TO OVERLAPPING 


A student of one of the writers was attempting to find the 
coefficient of correlation between college grades for a single year 
and those for other years. His data were in a form that gave 
the average number of grade points earned by each student 
during the junior year and the average up to the end of this same 
year. In the grading system in question quality points of 0, 1,
2, or 3 are given in each course; an average of these points for
the year is obtained by multiplying the number of points awarded
for a course by the number of “hours” in the course, summing
for all courses of the year, and dividing by the aggregate number
of hours carried during the year; and a similar point average is
computed “up-to-date.” Our student could not obtain the
average of the two preceding years excluding the junior year 
without considerable extra work. When he correlated the junior 
year averages with those of the 3-year period including the junior 
year, his r was .92, which was obviously spuriously high. It was 
spuriously high on account of the fact that the array which was 
to be used as a criterion included the array which was to be 
correlated with it. How could he remove this spurious element 
due to overlapping and ascertain what would have been the r if 
the overlapping element could have been removed from the 
accumulated point averages before computing the coefficient of
correlation? We developed for this purpose a formula which
applies to any sort of case where averages are employed and the 
criterion average includes the factor to be correlated with it. 
We shall use the following notation: 


x = accumulated point average for all three years

y = point average 1 year less, to end of sophomore year

z = point average for the junior year alone

a = the number of years combined in x, in this case 3.


For the sake of simplicity of development we shall take the x,
y, and z as deviations from the means of their respective series.
It can easily be shown that, if the means of the constituent
arrays are all equal, the relations among the deviations will be 
the same as the corresponding relations among the scores; and a 
little later we shall show that the effects of inequalities among 
means, too, cancel out in this problem, so that we commit no 
error by developing our formula in terms of deviations rather 
than in terms of scores. 

(a - 1) years were involved in making up the point averages
of y. Therefore, for any particular student,

$$x = \frac{y(a - 1) + z}{a}$$

Clearing of fractions,

$$ax = y(a - 1) + z$$

Multiplying through by z,

$$axz = yz(a - 1) + z^2$$

Summing for all students in the problem and dividing by their
number,

$$\frac{a \sum xz}{N} = \frac{(a - 1)\sum yz}{N} + \frac{\sum z^2}{N}$$

The product moments can be reduced to r’s, as done repeatedly
in this book, by multiplying both numerators and denominators
of the fractions by the two required σ’s. Doing this, and using
$\sigma_z^2$ for $\sum z^2/N$, we have

$$a\, r_{xz} \sigma_x \sigma_z = r_{yz} \sigma_y \sigma_z (a - 1) + \sigma_z^2$$

We want a value for $r_{yz}$. Transposing, and solving for this, we
get

$$r_{yz} \sigma_y \sigma_z (a - 1) = a\, r_{xz} \sigma_x \sigma_z - \sigma_z^2; \qquad r_{yz} = \frac{a\, r_{xz} \sigma_x - \sigma_z}{(a - 1)\sigma_y}$$
(132)

Roughly, we might take the σ’s to be equal, in which case our
formula would reduce to $r_{yz} = (a r_{xz} - 1)/(a - 1)$. But, in view
of our showing in formula (124), the standard deviation of an
average of correlated arrays is less than that of one of the component
arrays and less than that of the average of a smaller
number of arrays; to take the σ’s as equal would give us a predicted
r that would be somewhat too high. We shall do better,
therefore, not to assume equality of σ’s in the formula even
though we have reason to believe that the variabilities of the
point averages are about the same in each year. We know, as a
by-product from the calculation of $r_{xz}$, the $\sigma_x$ and the $\sigma_z$. We do
not know $\sigma_y$, and to compute it directly would make us too much
trouble, since we would need to make up a set of averages for the
(a - 1) composite, which is just what we are trying to avoid.
But we can easily develop a formula for getting this standard
deviation through elements we already know. In any one
student’s case,

$$y = \frac{ax - z}{a - 1} \qquad \text{and} \qquad y^2 = \frac{a^2 x^2 - 2axz + z^2}{(a - 1)^2}$$

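Summing for all individuals and dividing by their number,

$$\frac{\sum y^2}{N} = \frac{1}{(a - 1)^2}\left[\frac{a^2 \sum x^2}{N} - \frac{2a \sum xz}{N} + \frac{\sum z^2}{N}\right]$$

$$\sigma_y^2 = \frac{a^2 \sigma_x^2 - 2a\, r_{xz} \sigma_x \sigma_z + \sigma_z^2}{(a - 1)^2}$$

$$\sigma_y = \frac{1}{a - 1}\sqrt{a^2 \sigma_x^2 - 2a\, r_{xz} \sigma_x \sigma_z + \sigma_z^2}$$
(Standard deviation of the average of (a - 1) arrays in terms of the average
of a arrays and of 1 array) (133)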


We may now substitute this in formula (132) and have

$$r_{yz} = \frac{a\, r_{xz} \sigma_x - \sigma_z}{\sqrt{a^2 \sigma_x^2 - 2a\, r_{xz} \sigma_x \sigma_z + \sigma_z^2}}$$
(Coefficient of correlation between the averages from (a - 1) arrays and
1 array in terms of the averages from a arrays and 1 array) (134)

We shall now correct for overlapping the spuriously high
correlation found by our student. His $r_{xz}$ was .92; $\sigma_x$ was .47;
and $\sigma_z$ was .53. a was 3. Substituting these values in our
formula we get

$$r_{yz} = \frac{3 \times .47 \times .92 - .53}{\sqrt{9 \times .47^2 - 2 \times 3 \times .92 \times .47 \times .53 + .53^2}} = .811$$
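
The substitution may be checked with a short sketch of formula (134). The function name `r_excluding_overlap_average` is our own; the figures are those of the student's problem (r of .92, σ’s of .47 and .53, and a of 3).

```python
from math import sqrt


def r_excluding_overlap_average(r_xz, sd_x, sd_z, a):
    """Formula (134): r between the average of (a - 1) arrays and the 1 array."""
    numerator = a * r_xz * sd_x - sd_z
    denominator = sqrt(a ** 2 * sd_x ** 2 - 2 * a * r_xz * sd_x * sd_z + sd_z ** 2)
    return numerator / denominator


# The student's figures: r_xz = .92, sigma_x = .47, sigma_z = .53, a = 3.
print(round(r_excluding_overlap_average(0.92, 0.47, 0.53, 3), 3))  # 0.811
```

The spurious r of .92 thus falls to about .81 when the junior year is removed from the accumulated average.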


This same general type of problem often confronts the investi- 
gator when he wishes to determine the relative validities of 
different tests by correlating the scores of each with the average 
of all the others, or when he wishes to learn which of several 
judges is best by ascertaining which judge’s estimates correlate 
most highly with the average of all the others. It is much 
bother to make up each time a new average, omitting a different 
test or a different judge in turn. In order to save this labor, the 
same average of all is ordinarily kept for all the correlations, and 
each test or set of estimates is in turn correlated with this com- 
posite. But the r’s thus obtained are spuriously high, because 
of the element of overlapping due to the inclusion in the total 
of the scores to be correlated with it. We can remove this 
spurious element by the use of the formula developed above. 

But it will prove far easier to work with sums of scores than 
with averages, since an additional operation is required to 
reduce a sum to an average. A formula for sums instead of 
averages can be developed along the same lines, but more simply. 
We shall develop it in terms of scores rather than in terms of 
deviations, so as to fulfill our promise to show that inequality of 
means among the constituent arrays does not affect the formula. 




X = the score for an individual in the sum of all the arrays 

Y = the score of an individual in the sum of (a - 1) arrays,
the excluded one being the Z array 

Z = the score of an individual in the one array which we wish 
to correlate with the Y sum 

a = the whole number of arrays included in X 


Then, for any one individual,

$$X = Y + Z$$

Multiplying through by Z,

$$XZ = YZ + Z^2$$

Summing for all individuals and dividing by their number,
$(\sum XZ/N) = (\sum YZ/N) + (\sum Z^2/N)$. We must now subtract
from each term the correction required to make it conform to the
formula for an r or for a σ when taken in terms of scores rather
than in terms of deviations. We may legitimately do this
provided we compensatingly add these same terms.

$$\left(\frac{\sum XZ}{N\sigma_x\sigma_z} - \frac{M_x M_z}{\sigma_x\sigma_z}\right)\sigma_x\sigma_z = \left(\frac{\sum YZ}{N\sigma_y\sigma_z} - \frac{M_y M_z}{\sigma_y\sigma_z}\right)\sigma_y\sigma_z + \left(\frac{\sum Z^2}{N} - M_z^2\right) - M_x M_z + M_y M_z + M_z^2$$

$$r_{xz}\sigma_x\sigma_z = r_{yz}\sigma_y\sigma_z + \sigma_z^2 - M_x M_z + M_y M_z + M_z^2$$

Transposing and solving for $r_{yz}$,

$$r_{yz} = \frac{r_{xz}\sigma_x\sigma_z - \sigma_z^2 + (M_x M_z - M_y M_z - M_z^2)}{\sigma_y\sigma_z}$$

Before proceeding further we shall show that the M’s in the
parentheses aggregate zero. Since for each individual $Y = X - Z$,
$\sum Y = \sum X - \sum Z$, and therefore $M_y = M_x - M_z$. Substituting
this equivalent for the $M_y$ in the parentheses, we get

$$M_x M_z - (M_x M_z - M_z^2) - M_z^2 = M_x M_z - M_x M_z + M_z^2 - M_z^2 = 0.$$

In view of the fact that the M’s aggregate zero and cancel out, we
have, upon canceling the $\sigma_z$ from numerator and denominator,
$r_{yz} = (r_{xz}\sigma_x - \sigma_z)/\sigma_y$.
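We must now find a value for $\sigma_y$ in terms of X and Z, as in our
previous development.

$$Y = X - Z; \qquad Y^2 = X^2 - 2XZ + Z^2$$

$$\frac{\sum Y^2}{N} = \frac{\sum X^2}{N} - \frac{2\sum XZ}{N} + \frac{\sum Z^2}{N}$$

Adding and compensatingly subtracting the necessary quantities
to make our terms σ’s or r’s when the items are taken as scores
rather than as deviations,

$$\left(\frac{\sum Y^2}{N} - M_y^2\right) = \left(\frac{\sum X^2}{N} - M_x^2\right) - 2\left(\frac{\sum XZ}{N} - M_x M_z\right) + \left(\frac{\sum Z^2}{N} - M_z^2\right) - (M_y^2 + 2M_x M_z - M_x^2 - M_z^2)$$

$$\sigma_y^2 = \sigma_x^2 + \sigma_z^2 - 2 r_{xz}\sigma_x\sigma_z - (M_y^2 + 2M_x M_z - M_x^2 - M_z^2).$$

Before proceeding further, we shall show that the M’s in the
parentheses aggregate zero. $M_y = M_x - M_z$. Substituting this
in the parentheses,

$$(M_x^2 - 2M_x M_z + M_z^2 + 2M_x M_z - M_x^2 - M_z^2) = 0$$

We have left, therefore, as the value of $\sigma_y$,

$$\sigma_y = \sqrt{\sigma_x^2 + \sigma_z^2 - 2 r_{xz}\sigma_x\sigma_z}$$
(Standard deviation of the sum of (a - 1) arrays in terms of the sums of a
arrays and of 1 array) (135)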
We may now substitute this value of $\sigma_y$ in the formula in which
it occurs above and get as our completed formula

$$r_{yz} = \frac{r_{xz}\sigma_x - \sigma_z}{\sqrt{\sigma_x^2 + \sigma_z^2 - 2 r_{xz}\sigma_x\sigma_z}}$$
(Coefficient of correlation between the sums from (a - 1) arrays and
1 array in terms of the sums from a arrays and 1 array) (136)


In the case of averages, with which this section opened, we
would have experienced the same behavior as between scores
and deviations that we did in the case of summed series; i.e.,
inequalities of means in the individual arrays would have
canceled out, leaving us the same formula when operating with
items in score form as when operating in deviation form. None
of the formulas involve any special assumptions. They can be
counted upon to give the same r’s as would have been obtained
by separating the arrays before computing the coefficient of
correlation and are very convenient methods of correcting r’s
for overlapping.
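
That formula (136) reproduces the r obtained by actually separating the arrays may be verified on a small set of scores. The sketch below uses hypothetical data and function names of our own; it computes the r between the part and the total, applies formula (136), and compares the result with the r computed directly between the part and the remainder.

```python
from math import sqrt


def sd(values):
    """Standard deviation about the mean, dividing by N."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(xs, ys):
    """Product-moment r from paired lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return sxy / (sd(xs) * sd(ys))


def r_excluding_overlap_sum(r_xz, sd_x, sd_z):
    """Formula (136): r between the sum of (a - 1) arrays and the 1 array."""
    return (r_xz * sd_x - sd_z) / sqrt(sd_x ** 2 + sd_z ** 2 - 2 * r_xz * sd_x * sd_z)


# Hypothetical scores: Z is the single array, Y the sum of the others.
z_part = [4, 7, 5, 9, 6, 8]
y_rest = [20, 31, 22, 35, 30, 29]
x_total = [y + z for y, z in zip(y_rest, z_part)]

inferred = r_excluding_overlap_sum(pearson_r(x_total, z_part), sd(x_total), sd(z_part))
direct = pearson_r(y_rest, z_part)
print(abs(inferred - direct) < 1e-9)  # True
```

The means of the two arrays here are quite unequal, yet the inferred and the directly computed r agree, as the cancellation of the M’s in the development requires.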


SPURIOUS INDEX CORRELATION 
Another condition under which coefficients of correlation may 
be spuriously high is when each of the paired items is divided 
by a factor that is correlated with them. Thus, when we correlate
IQ’s and EQ’s we have

$$\frac{\mathrm{M.A.}}{\mathrm{C.A.}} \qquad \text{and} \qquad \frac{\mathrm{E.A.}}{\mathrm{C.A.}}$$

Here both mental age and educational age are divided by
chronological age. If C.A. were a constant the division would,
of course, have no effect upon the correlation. But it is not a
constant; it varies from pair to pair in a way that involves
correlation with the numerator of its fraction. This involves a
community between the two arrays that appreciably affects the
correlation. The remedy here is to compute the coefficient of
correlation between M.A. and E.A. with C.A. held constant by 
the technique of partial correlation. This will be discussed in 
our next chapter. 

A corresponding thing happens when AQ’s and IQ’s are 
correlated. Here the common factor, mental age, enters 
as numerator in one of the variables and denominator in the 
other. The effect is characteristically a negative correlation 
between intelligence quotient and educational quotient (see 
reference to Douglass and Huffaker at end of this chapter). 


Exercises 


1. Have the members of the class in which you are participating, or have
at least three teachers, rate specimens of handwriting, or of sewing, or of art 
on one of the scales available for such purpose. Determine the reliability 
of the ratings. Calculate how many judges would be needed to give a 
reliability coefficient of .97. 

2. Have the members of the class estimate the weights of one another and 
determine both the reliability and the validity of the estimates. 

3. In a similar spirit have persons give character ratings on classmates or
on fraternity brothers, or have teachers rate their pupils on character traits,
and ascertain the reliability of the ratings. Try different techniques for
making these ratings specific and objective and ascertain effect upon
reliability.

4. In a particular situation the r between history scores and geography
scores is found to be .78. The history test has a reliability coefficient of .87 
and the geography test a reliability coefficient of .91. Correct the r between 
history and geography for attenuation. 

5. In a range of eight grades a certain intelligence test has a reliability 
coefficient of .98. What should it be expected to be for a single grade if the 
standard deviation for the eight grades combined is 32 and that for the single 
grade in question is 23? 

6. In Table IV, pages 58 to 61, compute the r between scores on 
English literature and total English scores (which include those on litera- 
ture). By means of formula (136) determine what the r should be between 
scores on literature and scores on the remainder of the test excluding
literature; i.e., remove the spurious effect of the overlapping. Finally, actually
subtract the literature scores from the total, compute the correlation, and see 
how your r from the actual net scores compares with the inferred one.


CHAPTER VIII
PARTIAL AND MULTIPLE CORRELATION 


The title to this chapter is likely to lead the reader to fear 
that he will find treated at this point a very complicated and a 
very mysterious topic. If he is already acquainted with the 
treatment of this matter in elementary texts in statistics, his 
experience there is likely to have confirmed this impression, 
especially in view of the rather forbidding-looking formulas that 
enter into the technique. But, in fact, partial regression and 
partial and multiple correlation are very simple in principle 
and parallel at every point simple regression and simple cor- 
relation as treated in Chap. IV. The reader should observe 
this parallelism as he progresses through the chapter. 


NATURE AND USE OF THE MULTIPLE REGRESSION EQUATION 


We may best approach the problem of the nature and use of 
the multiple regression equation through some illustrations. 
An agriculturalist wishes to predict from conditions that obtain 
up to the end of May what will most likely be the number of 
bushels of wheat produced per acre in July. This yield will be 
influenced by several known factors, such as, (1) the aggregate 
number of inches of rainfall through April and May, (2) the 
number of days of sunshine through these months, (3) the average 
temperature through these months. But these are not of equal 
importance as factors. In making his estimate, he must multiply 
each by an index number that will most closely accord with its 
relative degree of importance in affecting the yield of wheat. 
These several indices are the regression coefficients. 

Again, we wish to predict a student’s grade in school from 
several factors known in advance. These factors are such as 
the following: (1) his intelligence score; (2) time spent in study; 
(3) health; (4) the socioeconomic status of the home in which 
he lives. If we wish to predict his scholarship most closely, we 
may not give equal consideration to each of these factors, but
must multiply the scores on each by a coefficient growing out of
its relative importance as a factor in producing the result in 
which we are interested. A critical problem becomes for us, 
then, the problem of ascertaining what are the relative degrees 
of importance with which the several components enter in the 
determination of the criterion; i.e., finding the several regression 
coefficients. Or we may merely wish to know the relative weights 
of the factors as an evidence of their relative importance. Many 
such problems confront the educational and social research 
worker, such as: To what extent do stature, intelligence, quick- 
ness of decision, and breadth of scholarship contribute to leader- 
ship? In what relative degrees do hours spent in formal drill, in 
browsing, in listening to lectures, in going to movies contribute 
to one’s knowledge of history? And countless others. These
relative weights are found by essentially the same procedure as 
the coefficients which are mentioned above. Or, perhaps, we 
may wish to put the matter in terms of coefficients of correlation 
because these are more familiar than regression coefficients. We 
shall then wish to find the extent of correlation between each 
of our several causative factors and our criterion when the influ- 
ence of the other factors is ruled out. These correlations which 
express what may be expected to be the relation between one of 
a team of factors and a criterion when the influence of the other 
members of the team is held constant, we call the coefficients of
partial correlation. We shall see later that they are very closely 
akin to the regression coefficients and that they may be derived 
by essentially the same machinery. Perhaps we may wish to 
know what is the maximum accuracy with which we could predict 
a criterion by combining a number of predictive factors each 
with its “best weight”; how high a correlation, i.e., we could get
in the case of our illustration between scholarship and our four 
contributing factors listed above taken jointly if we combined 
these in the best possible proportion. This maximum correlation 
that may be expected from combining a team of factors is called 
the coefficient of multiple correlation. It may be derived directly 
from the regression coefficients. When, too, the technique of 
computing partial correlations can be made sufficiently simple to 
permit its use by the rank and file of research workers, we shall 
find it a feasible substitute for parallel-group experimentation 
where we cannot control certain disturbing variables in our
experiment but can rule out their influence upon our findings by
the partial correlation technique. It is clear, therefore, that, 
if we can get hold of the secret of finding partial regression 
coefficients, we shall have at our command an extremely useful 
tool in all our research. 


DERIVATION OF THE FUNDAMENTAL “NORMAL EQUATIONS” 


Our problem, you remember, is to find multipliers for a team 
of scores that will give each of the scores best weight in predicting 
(or producing) a criterion score. Let us put this in the form of 
an equation. Let us suppose that $x_0$ is the score of a particular
student in scholarship (taken as a deviation from the mean of
the scholarship scores rather than as a raw score), and $x_1$, $x_2$, $x_3$,
and $x_4$ are the same student’s scores on intelligence, socioeconomic
status, health, and attendance, respectively. Then

(A) $\bar{x}_0 = b_1x_1 + b_2x_2 + b_3x_3 + b_4x_4$

where the b’s are the coefficients by which we must multiply the 
several scores. We do not, of course, know the values of these 
b’s; their values are just the very things we are seeking. But it
is one of the beauties of algebra that it permits us to play with 
quantities, even if we do not yet know their numerical values; 
we merely designate them by letters and handle them as such until 
we can reach the point where we shall have found values for 
them. We know that there is some value by which we must 
multiply $x_1$ in order to give it its best weight, and also some value
for each of the other coefficients, and we merely set b’s for these
values and then proceed to search for the numerical equivalents for
the b’s as our regression coefficients.

But we have probably measured intelligence, socioeconomic 
status, health, and attendance in terms of different units, so that 
our x’s in the different series are of unlike meaning. To carry
them along in this way will cause unnecessary cumbersomeness. 
Let us, therefore, reduce all of our scores to “standard measures” 
which have everywhere the same meaning; we can easily return 
to ordinary scores when we wish. To get standard measures we 
merely divide each deviation score by the standard deviation of 
the array to which it belongs. We shall let z’s stand for the scores 
in standard measures. Then 


$$z_0 = \frac{x_0}{\sigma_0}, \quad z_1 = \frac{x_1}{\sigma_1}, \quad z_2 = \frac{x_2}{\sigma_2}, \quad \text{etc.}$$

We shall also use β’s for the regression coefficients with standard
measures instead of small b’s. Our equation then becomes


$$\bar{z}_0 = \beta_1z_1 + \beta_2z_2 + \beta_3z_3 + \beta_4z_4$$
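
The reduction to standard measures described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the sample scores are hypothetical, and the standard deviation is taken with N (not N - 1) in the denominator, which is the convention the later step "divide through by 2N" appears to assume.

```python
from math import sqrt


def standard_measures(raw_scores):
    """Convert raw scores X to standard measures z = (X - M) / sigma."""
    n = len(raw_scores)
    mean = sum(raw_scores) / n
    deviations = [score - mean for score in raw_scores]   # x = X - M
    sigma = sqrt(sum(d * d for d in deviations) / n)      # divisor N, not N - 1
    return [d / sigma for d in deviations]


# Hypothetical intelligence scores for five students.
intelligence = [90, 100, 110, 120, 130]
z_intelligence = standard_measures(intelligence)

# Standard measures always have mean 0 and standard deviation 1,
# whatever units the raw scores were recorded in.
z_mean = sum(z_intelligence) / len(z_intelligence)
z_sigma = sqrt(sum(z * z for z in z_intelligence) / len(z_intelligence))
```

Because every array is brought to the same mean and variability, the weights applied to the z scores can be compared with one another directly.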


In order to generalize our equation, we merely extend it out 
toward the right to any number of factors. 


(B) $\bar{z}_0 = \beta_1z_1 + \beta_2z_2 + \beta_3z_3 + \cdots + \beta_nz_n$


Now, if only we could apply some algebraic procedure to 
this equation, we might find the values of our several β’s. But
we are blocked by the fact that we have only one equation and
in it a number of unknowns, namely all the β’s. We are helpless in
this sort of situation until we have as many independent equa- 
tions as unknown quantities. Is there any way out? 

Yes, there is a way out, a way of getting as many equations 
as we need, but it involves a little calculus. 

We have drawn a little horizontal line above the $z_0$ in the
preceding equation. That is to indicate that it would be the
value computed (estimated from the combination on the right).
Each score thus estimated from the right side of the equation,
when the β’s are so determined as to give the best prediction on
the average, would miss the corresponding student’s actual score
a little. The error in any one case would be $z_0 - \bar{z}_0$. If we
substitute for $\bar{z}_0$ its value from Eq. (B), we have¹


$$z_0 - \bar{z}_0 = z_0 - (\beta_1z_1 + \beta_2z_2 + \beta_3z_3 + \cdots + \beta_nz_n)$$


Now it is a principle of mathematics that for best fit the sum 
of the squares of the errors should be a minimum. We shall, 
therefore, square both sides of our equation and indicate by the 
summation sign that we have passed from considering a single 
score to the consideration of all the scores, because they all obey 
the same laws. Indicating the square on the left and squaring 
out the term on the right, we have 


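$$
\begin{aligned}
\sum(z_0 - \bar{z}_0)^2 = \sum(&z_0^2 + \beta_1^2z_1^2 + \beta_2^2z_2^2 + \beta_3^2z_3^2 + \cdots + \beta_n^2z_n^2 \\
&- 2z_0z_1\beta_1 - 2z_0z_2\beta_2 - 2z_0z_3\beta_3 - \cdots - 2z_0z_n\beta_n \\
&+ 2z_1z_2\beta_1\beta_2 + 2z_1z_3\beta_1\beta_3 + 2z_1z_4\beta_1\beta_4 + \cdots + 2z_1z_n\beta_1\beta_n \\
&+ 2z_2z_3\beta_2\beta_3 + 2z_2z_4\beta_2\beta_4 + 2z_2z_5\beta_2\beta_5 + \cdots + 2z_2z_n\beta_2\beta_n + \cdots)
\end{aligned}
$$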


1 The procedure here is identical in character with that involved in the
Pearson product-moment formula, p. 95.

Now, as was stated above, we want the values of the β’s to be
such that the right-hand member of our equation will be a minimum.
We must, therefore, differentiate it and set its derivative
equal to zero. But, since our equation contains a number of
independent variables (the several β’s), we must resort to partial
differentiation, i.e., we must differentiate separately for each of
the variables in turn. Differentiating first with respect to $\beta_1$
(and remembering that every term will drop out of our derivative
that does not contain a $\beta_1$), we have


$$\sum(2\beta_1z_1^2 - 2z_0z_1 + 2\beta_2z_1z_2 + 2\beta_3z_1z_3 + 2\beta_4z_1z_4 + \cdots + 2\beta_nz_1z_n) = 0$$


We shall now place the summation sign with each of the ele- 
ments within the parentheses, which is a legitimate way of sum- 
mating a complex quantity, and also divide through our equation 
by 2N. Then we have 

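$$\beta_1\frac{\sum z_1^2}{N} - \frac{\sum z_0z_1}{N} + \beta_2\frac{\sum z_1z_2}{N} + \beta_3\frac{\sum z_1z_3}{N} + \beta_4\frac{\sum z_1z_4}{N} + \cdots + \beta_n\frac{\sum z_1z_n}{N} = 0$$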


But remember that $z_1$ equals $x_1/\sigma_1$, $z_2$ equals $x_2/\sigma_2$, etc. Therefore
$\sum z_1z_2/N$ equals $\sum x_1x_2/N\sigma_1\sigma_2$. But this, it will be observed,
is the formula for the Pearson r. For all such quantities in our
equation we may, therefore, substitute r’s with the proper subscripts.
We shall then have (remembering that $r_{11}$ represents
perfect self-correlation, which is equal to unity)


$$\beta_1 - r_{01} + \beta_2r_{12} + \beta_3r_{13} + \beta_4r_{14} + \cdots + \beta_nr_{1n} = 0$$
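
The substitution just made rests on the fact that the mean product of two arrays of standard measures is the Pearson r. A small Python sketch can confirm this numerically. The two score lists are hypothetical, and the helper names are chosen here only for illustration.

```python
from math import sqrt


def deviations(scores):
    """Deviation scores x = X - M."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def sigma(devs):
    """Standard deviation from deviation scores, with divisor N."""
    return sqrt(sum(d * d for d in devs) / len(devs))


def pearson_r(xs, ys):
    """Pearson r as sum(xy) / (N * sigma_x * sigma_y)."""
    x, y = deviations(xs), deviations(ys)
    return sum(a * b for a, b in zip(x, y)) / (len(x) * sigma(x) * sigma(y))


def mean_z_product(xs, ys):
    """Mean product of paired standard measures, sum(z_x * z_y) / N."""
    x, y = deviations(xs), deviations(ys)
    sx, sy = sigma(x), sigma(y)
    return sum((a / sx) * (b / sy) for a, b in zip(x, y)) / len(x)


# Hypothetical paired scores on two tests.
scores_1 = [2, 4, 5, 7, 9]
scores_2 = [1, 3, 6, 6, 10]

r_12 = pearson_r(scores_1, scores_2)
z_product_12 = mean_z_product(scores_1, scores_2)   # agrees with r_12
self_correlation = mean_z_product(scores_1, scores_1)  # r_11, equal to unity
```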
We must next differentiate in the same way for each of the other
β’s. The result will be n equations similar in symmetry with the
one above. Transposing the terms preceded by the minus sign,
we have the following as our set of “normal equations”:


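$$
\begin{aligned}
r_{01} &= \beta_1 + \beta_2r_{12} + \beta_3r_{13} + \beta_4r_{14} + \beta_5r_{15} + \beta_6r_{16} + \cdots + \beta_nr_{1n} \\
r_{02} &= \beta_1r_{12} + \beta_2 + \beta_3r_{23} + \beta_4r_{24} + \beta_5r_{25} + \beta_6r_{26} + \cdots + \beta_nr_{2n} \\
r_{03} &= \beta_1r_{13} + \beta_2r_{23} + \beta_3 + \beta_4r_{34} + \beta_5r_{35} + \beta_6r_{36} + \cdots + \beta_nr_{3n} \\
r_{04} &= \beta_1r_{14} + \beta_2r_{24} + \beta_3r_{34} + \beta_4 + \beta_5r_{45} + \beta_6r_{46} + \cdots + \beta_nr_{4n} \\
&\;\;\vdots \\
r_{0n} &= \beta_1r_{1n} + \beta_2r_{2n} + \beta_3r_{3n} + \beta_4r_{4n} + \beta_5r_{5n} + \beta_6r_{6n} + \cdots + \beta_n
\end{aligned}
$$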


Our concern from the first was to find values for our β’s. We
have now found our way of doing so. We have as many equations
in our set as we have independent variables. It is, therefore,
in principle, a very simple matter to solve these equations
and find the values of our β’s. All that there is to any special
method of computing regression coefficients is some method of
simplifying the solution of a set of simultaneous equations.
Practically every reader has solved such problems in high-school 
algebra, and you might be interested in trying on this set of 
equations the methods you have learned there, substituting 
numerical values, of course, for the r’s. But you will find the 
job extremely complex if you use more than two or three varia- 
bles. The labor mounts rapidly with an increase in the number 
of equations and becomes enormous after four or five. 
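
For readers with a computer at hand, the labor of solving the normal equations disappears. The following is a minimal Python sketch, not the procedure of this chapter: it uses ordinary Gaussian elimination with back substitution, and the function name and the two-factor correlations are hypothetical values chosen for illustration.

```python
def solve_normal_equations(r_matrix, r_criterion):
    """Solve the normal equations R * beta = r0 for the beta weights.

    r_matrix[i][j] is the correlation between factors i+1 and j+1
    (unity on the diagonal); r_criterion[i] is the correlation of
    factor i+1 with the criterion.
    """
    n = len(r_criterion)
    # Augmented matrix: each row is one normal equation.
    rows = [list(r_matrix[i]) + [r_criterion[i]] for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]

    # Back substitution, last unknown first.
    betas = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * betas[j] for j in range(i + 1, n))
        betas[i] = (rows[i][n] - known) / rows[i][i]
    return betas


# Hypothetical two-factor problem: r12 = .50, r01 = .60, r02 = .40.
betas = solve_normal_equations([[1.0, 0.5], [0.5, 1.0]], [0.6, 0.4])
```

With two factors the result can be checked by hand: the first weight is (.60 - .40 x .50) / (1 - .50²) and the second is (.40 - .60 x .50) / (1 - .50²).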


THE MOST ECONOMICAL METHODS FOR COMPUTING 
REGRESSION COEFFICIENTS 

The principal trick in working with regression equations is to 
command some economical method of solving these equations. 
In the texts on statistics the method customarily explained 
involves doing in turn a series of partial sigmas and partial r’s, 
each time reducing the partials to a lower order by one. This is 
the method developed by Yule, which we may call the partial 
correlations method. But this process gets immensely complicated 
beyond three or four variables and also involves working with 
what for the layman are magical procedures the meaning of 
which he does not grasp. There is needed a simpler and more 
meaningful method. 

Various persons have devised schemes for reducing the labor 
involved in solving such sets of simultaneous equations. As 
early as 1855 Gauss developed a method of shortening the proc- 
ess by successive trials and approximations. The determinants 
treated in texts in college algebra afford a convenient method for 
solving simultaneous equations for those who are familiar with 
them. Within the past ten years Truman Kelley developed an 
indirect method for attacking this particular problem of regres- 
sion coefficients which he called first the approximation method 
and later, in a further developed form, the iteration method. But 
these approximation methods are also difficult to learn and very 
baffling to laymen. Peters and Wykes¹ set forth a completed-
determinants method making available to laymen the method
of determinants in a form that makes no demand that the
worker know the algebra of determinants.

1 PETERS and WYKES, “Completed Determinants Method,” J. Educ. Res.,
Vol. 24, pp. 44-52.

The Doolittle method was developed by M. H. Doolittle, of 
the U.S. Coast and Geodetic Survey, about 1857. Up to this 
time but little attention has been given to it in texts on educa-
tional statistics, although such books as those of Mills and of 
Ezekiel have presented it. For any considerable number of 
variables it is by far the best method available. We shall turn 
now to an explanation of it and to certain work sheets embodying 
it. The Doolittle method takes advantage of the fact that our 
set of equations is symmetrical to multiply each, as it is used, 
by such factor that, when the equations are added, certain terms 
at the left aggregate zero and are thus eliminated. In this way 
the number of terms is rapidly reduced to a single unknown, 
which can then be directly calculated. Thereafter the process 
must be reversed and substitutions progressively made until 
all of the unknowns have been found. In the following work 
sheets, set up by Mrs. Wykes, the procedure is indicated step 
by step. Our work sheet extends to ten variables, but beyond 
that number the student can make his own formulas by induction 
from the steps used up to ten variables. Evidence given in the
series of articles, of which the one just referred to is the second, 
shows that the completed-determinants method is the most 
economical one up to four variables and that beyond that point 
the Doolittle method is by far the most economical.
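
The elimination and back solution just described can be sketched in Python. This is a sketch under stated assumptions rather than a transcription of the work sheet: the function name and list layout are invented here, each "section" of the work sheet is represented by one "add" line and one "divide" line, and the three-factor correlations in the example are hypothetical.

```python
def doolittle_betas(r_matrix, r_criterion):
    """Regression weights by a Doolittle-style forward and back solution.

    r_matrix[i][j] is the correlation between factors i+1 and j+1;
    r_criterion[i] is the correlation of factor i+1 with the criterion.
    """
    n = len(r_criterion)
    add_lines = []     # the "add algebraically" line of each section
    divide_lines = []  # that line divided by the negative of its leading item

    for k in range(n):
        # "Insert values for r's": 1, then the r's to the right,
        # then the criterion correlation with its sign reversed.
        line = [0.0] * (n + 1)
        for j in range(k, n):
            line[j] = r_matrix[k][j]
        line[n] = -r_criterion[k]

        # Add each earlier "add" line multiplied by the item that the
        # matching "divide" line carries in the current column.
        for earlier in range(k):
            multiplier = divide_lines[earlier][k]
            for j in range(k, n + 1):
                line[j] += add_lines[earlier][j] * multiplier

        add_lines.append(line)
        divide_lines.append([value / -line[k] for value in line])

    # Back solution: the last beta is read off directly; the others
    # follow by successive substitution in the "divide" lines.
    betas = [0.0] * n
    for k in range(n - 1, -1, -1):
        betas[k] = divide_lines[k][n] + sum(
            betas[j] * divide_lines[k][j] for j in range(k + 1, n)
        )
    return betas


# Hypothetical three-factor problem.
r_factors = [[1.0, 0.3, 0.2],
             [0.3, 1.0, 0.4],
             [0.2, 0.4, 1.0]]
r_with_criterion = [0.5, 0.4, 0.3]
betas = doolittle_betas(r_factors, r_with_criterion)

# Substituting the betas back into the normal equations should
# reproduce the criterion correlations.
reproduced = [sum(r_factors[i][j] * betas[j] for j in range(3)) for i in range(3)]
```

Only the upper triangle of the correlation table is ever read, which is the advantage the symmetry of the normal equations gives the method.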


WORK SHEETS FOR THE DOOLITTLE METHOD 


The directions on the accompanying sheet are so explicit 
that no difficulty should be encountered in understanding them. 
An illustration will be given after the work sheet has been given 
and explained. It is to be noted that each row (line) has a 
number and each column a designating letter. These numbers
and letters are used in the directions. 

The 1 found in the first row and column of each new section is 
considered in all calculations. It has been placed there perma- 
nently because that is always the value of the item in that place. 

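The work sheet is prepared for calculations up to a ten-
variable problem. For any larger number of variables the reader
can extend the work sheet by induction from the steps so far
presented. For anything less than a ten-variable problem,
columns not needed should be cut off (marked out) beginning
with the J column. Thus, work would be continued only through
the H column for a nine-variable problem, through the G column
for an eight-variable problem, and so on. (The criterion is always counted
as one of the variables.) In all cases, no matter what the number
of variables, the I column is used.


WORK SHEET FOR THE DOOLITTLE METHOD

Line | Directions | Values inserted, by column | X (check)
1 | Insert values for r’s | A: 1; B to J: $r_{12}$, $r_{13}$, $r_{14}$, $r_{15}$, $r_{16}$, $r_{17}$, $r_{18}$, $r_{19}$; I: $-r_{01}$ | $\sum$ A to I
2 | Divide line 1 by -1 | |
3 | Insert values for r’s | B: 1; C to J: $r_{23}$, $r_{24}$, $r_{25}$, $r_{26}$, $r_{27}$, $r_{28}$, $r_{29}$; I: $-r_{02}$ | $\sum$ B to I
4 | Multiply items in line 1, B to I, by $B_2$ | |
5 | Add algebraically lines 3 and 4 | |
6 | Divide line 5 by negative $B_5$ | |
7 | Insert values for r’s | C: 1; D to J: $r_{34}$, $r_{35}$, $r_{36}$, $r_{37}$, $r_{38}$, $r_{39}$; I: $-r_{03}$ | $\sum$ C to I
8 | Multiply items in line 1, C to I, by $C_2$ | |
9 | Multiply items in line 5, C to I, by $C_6$ | |
10 | Add algebraically lines 7, 8, 9 | |
11 | Divide line 10 by negative $C_{10}$ | |
12 | Insert values for r’s | D: 1; E to J: $r_{45}$, $r_{46}$, $r_{47}$, $r_{48}$, $r_{49}$; I: $-r_{04}$ | $\sum$ D to I
13 | Multiply items in line 1, D to I, by $D_2$ | |
14 | Multiply items in line 5, D to I, by $D_6$ | |
15 | Multiply items in line 10, D to I, by $D_{11}$ | |
16 | Add algebraically lines 12, 13, 14, 15 | |
17 | Divide line 16 by negative $D_{16}$ | |
18 | Insert values for r’s | E: 1; F to J: $r_{56}$, $r_{57}$, $r_{58}$, $r_{59}$; I: $-r_{05}$ | $\sum$ E to I
19 | Multiply items in line 1, E to I, by $E_2$ | |
20 | Multiply items in line 5, E to I, by $E_6$ | |
21 | Multiply items in line 10, E to I, by $E_{11}$ | |
22 | Multiply items in line 16, E to I, by $E_{17}$ | |
23 | Add algebraically lines 18, 19, 20, 21, 22 | |
24 | Divide line 23 by negative $E_{23}$ | |
25 | Insert values for r’s | F: 1; G to J: $r_{67}$, $r_{68}$, $r_{69}$; I: $-r_{06}$ | $\sum$ F to I
26 | Multiply items in line 1, F to I, by $F_2$ | |
27 | Multiply items in line 5, F to I, by $F_6$ | |
28 | Multiply items in line 10, F to I, by $F_{11}$ | |
29 | Multiply items in line 16, F to I, by $F_{17}$ | |
30 | Multiply items in line 23, F to I, by $F_{24}$ | |
31 | Add algebraically lines 25, 26, 27, 28, 29, 30 | |
32 | Divide line 31 by negative $F_{31}$ | |
33 | Insert values for r’s | G: 1; H to J: $r_{78}$, $r_{79}$; I: $-r_{07}$ | $\sum$ G to I
34 | Multiply items in line 1, G to I, by $G_2$ | |
35 | Multiply items in line 5, G to I, by $G_6$ | |
36 | Multiply items in line 10, G to I, by $G_{11}$ | |
37 | Multiply items in line 16, G to I, by $G_{17}$ | |
38 | Multiply items in line 23, G to I, by $G_{24}$ | |
39 | Multiply items in line 31, G to I, by $G_{32}$ | |
40 | Add algebraically lines 33, 34, 35, 36, 37, 38, 39 | |
41 | Divide line 40 by negative $G_{40}$ | |
42 | Insert values for r’s | H: 1; J: $r_{89}$; I: $-r_{08}$ | $\sum$ H to I
43 | Multiply items in line 1, H to I, by $H_2$ | |
44 | Multiply items in line 5, H to I, by $H_6$ | |
45 | Multiply items in line 10, H to I, by $H_{11}$ | |
46 | Multiply items in line 16, H to I, by $H_{17}$ | |
47 | Multiply items in line 23, H to I, by $H_{24}$ | |
48 | Multiply items in line 31, H to I, by $H_{32}$ | |
49 | Multiply items in line 40, H to I, by $H_{41}$ | |
50 | Add algebraically lines 42, 43, 44, 45, 46, 47, 48, 49 | |
51 | Divide line 50 by negative $H_{50}$ | |
52 | Insert values for r’s | J: 1; I: $-r_{09}$ | $\sum$ J to I
53 | Multiply items in line 1, J to I, by $J_2$ | |
54 | Multiply items in line 5, J to I, by $J_6$ | |
55 | Multiply items in line 10, J to I, by $J_{11}$ | |
56 | Multiply items in line 16, J to I, by $J_{17}$ | |
57 | Multiply items in line 23, J to I, by $J_{24}$ | |
58 | Multiply items in line 31, J to I, by $J_{32}$ | |
59 | Multiply items in line 40, J to I, by $J_{41}$ | |
60 | Multiply items in line 50, J to I, by $J_{51}$ | |
61 | Add algebraically lines 52, 53, 54, 55, 56, 57, 58, 59, 60 | |
62 | Divide line 61 by negative $J_{61}$ | |

Substitute values from above table for symbols in following equations
(β’s for each equation found when each variable in turn is solved for)
and solve equations for the regression coefficients, $\beta_1$, $\beta_2$, $\beta_3$, ..., $\beta_9$.

$$
\begin{aligned}
\beta_9 &= I_{62} \\
\beta_8 &= \beta_9J_{51} + I_{51} \\
\beta_7 &= \beta_9J_{41} + \beta_8H_{41} + I_{41} \\
\beta_6 &= \beta_9J_{32} + \beta_8H_{32} + \beta_7G_{32} + I_{32} \\
\beta_5 &= \beta_9J_{24} + \beta_8H_{24} + \beta_7G_{24} + \beta_6F_{24} + I_{24} \\
\beta_4 &= \beta_9J_{17} + \beta_8H_{17} + \beta_7G_{17} + \beta_6F_{17} + \beta_5E_{17} + I_{17} \\
\beta_3 &= \beta_9J_{11} + \beta_8H_{11} + \beta_7G_{11} + \beta_6F_{11} + \beta_5E_{11} + \beta_4D_{11} + I_{11} \\
\beta_2 &= \beta_9J_6 + \beta_8H_6 + \beta_7G_6 + \beta_6F_6 + \beta_5E_6 + \beta_4D_6 + \beta_3C_6 + I_6 \\
\beta_1 &= \beta_9J_2 + \beta_8H_2 + \beta_7G_2 + \beta_6F_2 + \beta_5E_2 + \beta_4D_2 + \beta_3C_2 + \beta_2B_2 + I_2
\end{aligned}
$$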

The lower sections also disappear in toto with a smaller number 
of variables than ten. They will disappear below that section 
where the column with your highest-numbered r subscript 
terminates in a 1. But this will not bother you; it will take care
of itself. 

In the back solution, provided for at the foot of the work 
sheet, rows and columns also disappear with a smaller number 
of variables. Beginning at the top of the set of back-solution 
equations, draw a line through each on the left side of the equa- 
tions that has a subscript for which you have no corresponding 
r subscript. Extend this line all the way across, thus striking 
out the whole equation constituting that line. Then strike 
out all the columns that have had their respective I’s eliminated
with the eliminated rows. The remaining elements are all 
that are needed in the back solution, discussed in our next 
paragraph. 

You need now only find the values of the several β’s in these
back-solution equations at the foot of the work sheet. They are 
the regression coefficients you are seeking, each corresponding 
with a similarly numbered element in your r’s. The value of 
the topmost one is indicated directly in your equation; the others 
are obtained by successive substitutions in the equations that 
follow. 

The X column in the work sheet is the check for correctness. 
Its use is optional, but, if it is not used, the problem should be 
done simultaneously by two different workers and their results 
compared from time to time, or should be done by the same 
person twice on different work sheets, perfect consistency 
being required. 

The check is to be used as follows: the capital sigma is the 
summation sign, meaning that you should add together all the 
items in the line between the limits named, including the limits. 
The X column is to be included each time you are told to multiply,
add, or divide lines. But carefully note this exception: before
multiplying any summated item in the X column by the factor
used as a multiplier throughout the line, subtract from this
summation all the items in that line lying to the left of the column
with which the multiplications in that line started. For example,
in line 14 you are told to “multiply items in line 5, D to I, by $D_6$.”
When you extend this multiplication into the X column you
must first subtract from the value standing at $X_5$ the values at $C_5$
and $B_5$ before multiplying the remainder by $D_6$. It amounts
to the same thing to sum the line only from the column in which
the multiplier is found. A corresponding thing must be done at
lines 4, 8, 9, 13, and every other place where such a multiplication
is called for.¹
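
The exception in the use of the check column can be illustrated with a few lines of Python. The figures below are illustrative entries for one "add" line (columns B, C, D, and I) and one multiplier; the variable names are hypothetical and only the arithmetic of the check is being shown.

```python
# Illustrative entries of an "add" line (line 5), columns B, C, D, and I.
line_5 = {"B": 0.8631, "C": 0.4955, "D": 0.3682, "I": -0.2924}

# The X entry of that line is the sum of all its items, B to I.
x_5 = sum(line_5.values())

# Multiplier for a later line: the item in column D of the "divide" line.
d_6 = -0.4266

# "Multiply items in line 5, D to I, by D6."
products = {column: line_5[column] * d_6 for column in ("D", "I")}

# Check column: first remove the items lying to the left of column D
# (here B and C), then multiply the remainder by the same factor.
check = (x_5 - line_5["B"] - line_5["C"]) * d_6

# If no slip has been made, the check equals the sum of the products.
discrepancy = check - sum(products.values())
```

If the two quantities disagree by more than the rounding in the last decimal place, an arithmetic error has been made somewhere in the line.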

Before proceeding further, we shall do an example, since the 
directions for using the work sheets seem to be hard to follow 
before concrete illustration and easy to follow thereafter. We 
shall use a study by Miss Dessa E. Gresser on “The Factors 
Conditioning Comprehension of Literature in the Senior High 
School.”² One finds some pupils who have difficulty in comprehending
the literature they read. One feels at a loss to know
what to do for them until one knows what are the factors that 
cause the failure to comprehend. Miss Gresser undertook 
to ascertain what some of these factors are and with what weight 
they severally contribute toward the ability to comprehend. 
It is to this sort of problem that the partial regression technique 
lends itself as a tool. 

The factors considered by Miss Gresser were 


1. Speed of reading. 

2. Knowledge of grammar. 

3. Range of general information.
4. Knowledge of vocabulary. 


Each of these abilities was measured by suitable tests and scores
on each obtained for each pupil used in the research. The
criterion, ability to comprehend literature, was similarly measured
by the Stanford Test of Comprehension of Literature.
The following zero-order correlations were obtained among these
four factors and the criterion, for the last of which we use the
subscript 0 and for the others the numbers indicated above.


$r_{01}$ = .334    $r_{12}$ = .370    $r_{23}$ = .642    $r_{34}$ = .735
$r_{02}$ = .416    $r_{13}$ = .396    $r_{24}$ = .578
$r_{03}$ = .653    $r_{14}$ = .567
$r_{04}$ = .691


1 The spacing in the work sheet on p. 227 is too small to permit operations
on the page unless with a very sharp pencil. For a work sheet with wider
spacing the worker should copy this one on a larger scale or send to the
authors for copies.

2 A master’s thesis at the Pennsylvania State College, 1932. Abstracted
in Penn. State Studies in Educ. No. 8.


We must now insert these values for the several r’s in the
proper places in the work sheet and do what the work sheet
says. The reader should carefully verify all the steps, then use
this example as a model. In using the work sheet (given below)
we omit all parts we do not need for a problem of this length.

The results we get at the foot of the work sheet are the partial 
regression coefficients. What do they mean? They serve two
sorts of purposes. One of these is to show us the relative weights 
of the four factors in contributing to ability to comprehend 
literature. These weights are speed of reading, —.0691; knowl- 
edge of grammar, —.0857; range of information, +.3530; and 
knowledge of vocabulary, +.5203. The first two are substan- 
tially equal to zero, so that we may say grammar (as measured 
by the Kirby test) and ability to read rapidly do not contribute 
significantly to ability to comprehend literature when these are 
separated from the force they get by overlapping general infor- 
mation and knowledge of vocabulary. The other two contribute 
heavily, vocabulary making a larger net contribution than general 
information. These weights are not to be confused with
percentages; they cannot be expected to add up to +1. They are
the slopes of the respective regression lines relating the criterion
with each of the factors in turn, when the influence of the other
factors involved in the problem (but not additional disturbing
factors) is held constant and provided the factors are measured
in units of equal variability. 
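
The four weights quoted above can be tested by substituting them back into the normal equations, each of which should reproduce one of the correlations with the criterion. The Python sketch below does this; the dictionary layout and function names are chosen here for illustration, and $r_{24}$ is taken as .578, the value entered in the work sheet.

```python
# Zero-order correlations among the four factors.
r_factors = {
    (1, 2): 0.370, (1, 3): 0.396, (1, 4): 0.567,
    (2, 3): 0.642, (2, 4): 0.578, (3, 4): 0.735,
}
# Correlations of each factor with the criterion (subscript 0).
r_criterion = {1: 0.334, 2: 0.416, 3: 0.653, 4: 0.691}
# Regression weights reported from the work sheet.
betas = {1: -0.0691, 2: -0.0857, 3: 0.3530, 4: 0.5203}


def correlation(i, j):
    """Correlation between factors i and j (unity when i equals j)."""
    if i == j:
        return 1.0
    return r_factors[(min(i, j), max(i, j))]


def implied_criterion_r(i):
    """Right-hand side of the i-th normal equation."""
    return sum(betas[j] * correlation(i, j) for j in betas)


# Difference between each reproduced and each observed correlation.
residuals = {i: implied_criterion_r(i) - r_criterion[i] for i in r_criterion}
```

Each residual should be no larger than the rounding of the weights to four decimal places would lead one to expect.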

The second use is prediction of scores in the criterion from a 
knowledge of scores in each of the contributing factors. If we 
take our measures in z scores, we can predict the score a student 
is most likely to make in comprehension from his scores in the 
four other tests by the regression equation 


$$\bar{z}_0 = -.0691z_1 - .0857z_2 + .3530z_3 + .5203z_4$$


DOOLITTLE WORK SHEET
Miss Gresser’s Data, Arranged for Obtaining $\beta_{04.123}$

Line | Directions | A | B | C | D | I | X
1 | Insert values for r’s | 1 | $r_{12}$ .370 | $r_{13}$ .396 | $r_{14}$ .567 | $-r_{01}$ -.334 | +1.9990
2 | Divide line 1 by -1 | -1 | -.370 | -.396 | -.567 | +.334 |
3 | Insert values for r’s | | 1 | $r_{23}$ .642 | $r_{24}$ .578 | $-r_{02}$ -.416 | +1.8040
4 | Multiply items in line 1, B to I, by $B_2$ | | -.1369 | -.1465 | -.2098 | +.1236 | -.3696
5 | Add algebraically lines 3 and 4 | | +.8631 | +.4955 | +.3682 | -.2924 | +1.4344
6 | Divide line 5 by negative $B_5$ | | -1 | -.5741 | -.4266 | +.3388 | -1.6619
7 | Insert values for r’s | | | 1 | $r_{34}$ .735 | $-r_{03}$ -.653 | +1.0820
8 | Multiply items in line 1, C to I, by $C_2$ | | | -.1568 | -.2245 | +.1323 | -.2490
9 | Multiply items in line 5, C to I, by $C_6$ | | | -.2845 | -.2114 | +.1679 | -.3280
10 | Add algebraically lines 7, 8, and 9 | | | +.5587 | +.2991 | -.3528 | +.5050
11 | Divide line 10 by negative $C_{10}$ | | | -1 | -.5353 | +.6315 | -.9039
12 | Insert values for r’s | | | | 1 | $-r_{04}$ -.691 | +.3090
13 | Multiply items in line 1, D to I, by $D_2$ | | | | -.3215 | +.1894 | -.1321
14 | Multiply items in line 5, D to I, by $D_6$ | | | | -.1571 | +.1247 | -.0323
15 | Multiply items in line 10, D to I, by $D_{11}$ | | | | -.1601 | +.1889 | +.0287
16 | Add algebraically lines 12, 13, 14, and 15 | | | | +.3613 | -.1880 | +.1733
17 | Divide line 16 by negative $D_{16}$ | | | | -1 | +.5203 | -.4797

Substitute values from above table for symbols in the following equations
(β’s for each equation found when each variable in turn is solved for) and
solve equations for the regression coefficients, $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$.

$$
\begin{aligned}
\beta_4 &= I_{17} = +.5203 \\
\beta_3 &= (\beta_4)(D_{11}) + I_{11} = (.5203)(-.5353) + .6315 = +.3530 \\
\beta_2 &= (\beta_4)(D_6) + (\beta_3)(C_6) + I_6 = (.5203)(-.4266) + (.3530)(-.5741) + .3388 = -.0857 \\
\beta_1 &= (\beta_4)(D_2) + (\beta_3)(C_2) + (\beta_2)(B_2) + I_2 = -.0691
\end{aligned}
$$

But it is only in those cases where we are dealing with scores
that have the same variability in all distributions (like z scores,
T scores, ranks, or percentiles) that our predicting equation
remains so simple. If we have scores of unequal variabilities, we
must take account of the sigmas. If our measures are in terms
of deviations from the means of their respective arrays, our
regression equation would become


$$\bar{x}_0 = \sigma_0\left(-.0691\frac{x_1}{\sigma_1} - .0857\frac{x_2}{\sigma_2} + .3530\frac{x_3}{\sigma_3} + .5203\frac{x_4}{\sigma_4}\right)$$

If we wish to put this in terms of raw scores instead of deviations,
we must substitute $(X_1 - M_1)$ for $x_1$, $(X_2 - M_2)$ for $x_2$,
etc. We shall then have, as our regression equation,

$$\bar{X}_0 = \sigma_0\left(-.0691\frac{X_1}{\sigma_1} - .0857\frac{X_2}{\sigma_2} + .3530\frac{X_3}{\sigma_3} + .5203\frac{X_4}{\sigma_4}\right) - K$$

where K is given by the following expression (which may be 
calculated once for all the scores): 


$$K = \sigma_0\left(-.0691\frac{M_1}{\sigma_1} - .0857\frac{M_2}{\sigma_2} + .3530\frac{M_3}{\sigma_3} + .5203\frac{M_4}{\sigma_4}\right) - M_0$$
2 3 


The $\bar{X}_0$ in this equation is the score predicted for an individual
from the team at the right. Let us take the following hypo- 
thetical data from Miss Gresser’s study. A boy made the 
following scores: 


1. Speed of reading, 100. 
2. Grammar, 30. 

3. General information, 90. 
4. Vocabulary, 80.


What score may he be expected to make in comprehension of 
literature? We shall take the means and the standard deviations 
of our several factors to be as follows: 


Factor | Mean | Standard Deviation
1. Reading | 96 | 30
2. Grammar | 35 | 6
3. Information | 96 | 20
4. Vocabulary | 92 | 30
0. Comprehension | 63 | 22


Putting these values in our score-form regression equation, we 
predict for this boy: 


$$
\begin{aligned}
\bar{X}_0 &= 22\left(-.0691\tfrac{100}{30} - .0857\tfrac{30}{6} + .3530\tfrac{90}{20} + .5203\tfrac{80}{30}\right) \\
&\quad - 22\left(-.0691\tfrac{96}{30} - .0857\tfrac{35}{6} + .3530\tfrac{96}{20} + .5203\tfrac{92}{30}\right) + 63 = 57.5
\end{aligned}
$$

Thus for the boy in question we forecast a score of 57.5 in
comprehension, which is 5.5 points below the average.
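
The same forecast can be reached by a slightly different route, sketched here in Python: reduce the boy's four scores to standard measures, apply the β weights, and convert the predicted standard measure back to a raw score. The variable names are chosen for illustration; the figures are those given in the text.

```python
# Regression weights for reading, grammar, information, vocabulary.
betas = [-0.0691, -0.0857, 0.3530, 0.5203]

# Means and standard deviations of the four factors, in the same order.
means = [96, 35, 96, 92]
sigmas = [30, 6, 20, 30]

# Mean and standard deviation of the criterion (comprehension).
mean_0, sigma_0 = 63, 22

# The boy's raw scores on the four factors.
boy = [100, 30, 90, 80]

# Step 1: standard measures z = (X - M) / sigma for each factor.
z_scores = [(x - m) / s for x, m, s in zip(boy, means, sigmas)]

# Step 2: predicted standard measure in the criterion.
z_0 = sum(b * z for b, z in zip(betas, z_scores))

# Step 3: back to a raw score in the criterion.
predicted = mean_0 + sigma_0 * z_0
points_from_average = predicted - mean_0
```

Rounded to one decimal place, the predicted score and its distance from the mean agree with the figures stated in the text.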

In this particular problem the prediction function is not an 
important one; the value of the regression technique lies rather 
in its ability to show the relative weights of the contributing 
factors in explaining success in the comprehension of literature. 
But, if we wished to predict the probable academic success of a 
candidate for admission to college, so that we might select or 
reject him as a promising or unpromising candidate, or if as a 
basis for vocational guidance we wished to forecast the degree 
of a person’s success in each of several vocations, the prediction 
function might become one of great importance. As a pre- 
liminary to the use of a regression equation for prediction 
purposes, we must, of course, have had previous measurements 
of success in the criterion and in each of the predicting factors, 
on the basis of which we have been able to compute the necessary 
correlations. Assuming that these r’s will hold for future samples
of the population, we employ measurements of certain factors 
we can get now as a basis for prophesying scores in a criterion 
that, for the particular individuals for whom we are attempting 
to forecast, can come into existence only in the future. 

So far in this discussion we have put our formulas in particular 
terms—in terms of the number of variables and the particular 
coefficients of correlation in Miss Gresser’s problem. We shall 
now state the formulas in general terms. First is the case in 
which we use z scores, or other types of scores of the same varia- 
bility in all our arrays: 


$$\bar{z}_0 = \beta_{01.23\cdots k}z_1 + \beta_{02.134\cdots k}z_2 + \cdots + \beta_{0k.123\cdots}z_k$$
(Regression equation in terms of z scores) (137)


Next we take the case where scores are worked in terms of devia- 
tions from the means of their respective arrays but the variabili- 
ties of the several arrays are different. This involves merely 
substituting $x/\sigma$ for $z$.


$$\bar{x}_0 = \sigma_0\left(\beta_{01.23\cdots k}\frac{x_1}{\sigma_1} + \beta_{02.134\cdots k}\frac{x_2}{\sigma_2} + \cdots + \beta_{0k.123\cdots}\frac{x_k}{\sigma_k}\right)$$
(Partial regression equation in deviation terms) (138)


Finally we have the case where the several series are measured in
terms of unequal variabilities and where we are working with
raw scores. For this we replace $x$ by $(X - M)$.


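$$\bar{X}_0 = \sigma_0\left(\beta_{01.23\ldots k}\,\frac{X_1}{\sigma_1} + \beta_{02.134\ldots k}\,\frac{X_2}{\sigma_2} + \cdots + \beta_{0k.123\ldots}\,\frac{X_k}{\sigma_k}\right) - \sigma_0\left(\beta_{01.23\ldots k}\,\frac{M_1}{\sigma_1} + \beta_{02.134\ldots k}\,\frac{M_2}{\sigma_2} + \cdots + \beta_{0k.123\ldots}\,\frac{M_k}{\sigma_k}\right) + M_0$$
(Partial regression equation in terms of raw scores) (139)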
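
The raw-score form of the regression equation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `predict_raw_score` and all the numbers are hypothetical and are not taken from Miss Gresser's problem. The sketch turns each raw factor score into a z score, weights it by its partial β, and re-expresses the sum in the units of the criterion, which is algebraically the same as formula (139).

```python
def predict_raw_score(scores, means, sigmas, betas, criterion_mean, criterion_sigma):
    """Predict a raw criterion score from raw factor scores, as in formula (139)."""
    # Convert each raw factor score to a z score, weight it by its partial beta,
    # and add the weighted z scores; the sum is the predicted criterion z score.
    predicted_z = sum(
        beta * (score - mean) / sigma
        for score, mean, sigma, beta in zip(scores, means, sigmas, betas)
    )
    # Re-express the predicted z score in the criterion's own raw-score units.
    return criterion_sigma * predicted_z + criterion_mean


# Hypothetical two-factor illustration (none of these numbers come from the text).
betas = [0.50, 0.25]
means = [50.0, 20.0]
sigmas = [10.0, 4.0]
prediction = predict_raw_score([60.0, 24.0], means, sigmas, betas,
                               criterion_mean=100.0, criterion_sigma=15.0)
print(prediction)
```

A person who stands exactly at the mean of every factor is predicted to stand at the mean of the criterion, which is a convenient check on any such computation.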


PARTIAL CORRELATION 


When developing the formula for the Pearson product-moment
coefficient of correlation (page 96), we had occasion to notice
the relation between a coefficient of correlation and a regression
coefficient. When $b_{xy}$ stands for the regression coefficient of x
on y and $r_{xy}$ stands for the coefficient of correlation between these
two arrays, we saw that $r_{xy} = b_{xy}(\sigma_y/\sigma_x)$. The b, we said, is
the slope of the line in terms of whatever units happen to be
employed in measuring the x and the y, while the r is the slope
of the line of regression relating the paired variables when the
variabilities of the two arrays have been made equal. A precisely
parallel thing is true of partial correlation. The partial regres-
sion coefficient, which we learned above to find, is the slope of the
line relating the paired measures in a criterion and some other
factor when the influence of certain other factors has been ruled out
but when our units of measurement are not necessarily of equal
variability. However, corresponding to the case of the zero order
r’s, the coefficient of partial correlation is defined as the slope of
this regression line when the variabilities of the two arrays have
been made equal. Consequently, employing β01.2 to represent a
partial regression coefficient of the second order, r01.2 to represent
the corresponding partial correlation coefficient, σ0.12 (called the


1 The reader must not be misled into thinking that, if we start with z
scores or other units of equal variability as we did in developing our partial
regression formulas, we avoid this distinction by having only equal vari-
abilities all the way along. As a matter of fact, whenever we partial out
the influence of a factor, we (normally) lessen the variability of whatever
we have left, and the amount of such reduction differs for different factors.
So that, when we arrive at the point at which we wish to deal with partial
correlations, we have unequal variabilities regardless of the nature of the
measurements with which we started.

partial standard deviation) to be the standard deviation of the
criterion scores when the effects of the inclusion of factors 1 and


DOOLITTLE WORK SHEET
Miss Gresser’s Data, Arranged for Obtaining β40.123

Line  Operation                                      A     B          C          D     I            Check
 1    Insert values for r’s                          1     r12 .370   r13 .396         −r01 −.567   +1.5330
 2    Divide line 1 by −1                           −1     −.370      −.396            +.567        −1.5330
 3    Insert values for r’s                                1          r23 .642         −r02 −.578   +1.4800
 4    Multiply items in line 1, B to I, by B2              −.1369     −.1465                        −.1972
 5    Add algebraically lines 3 and 4                      +.8631     +.4955           −.3682       +1.2828
 6    Divide line 5 by negative B5                         −1         −.5741           +.4266       −1.4862
 7    Insert values for r’s                                           1                −r03 −.735   +.9180
 8    Multiply items in line 1, C to I, by C2                         −.1568           +.2245       −.0645
 9    Multiply items in line 5, C to I, by C6                         −.2845           +.2114       −.2409
10    Add algebraically lines 7, 8, and 9                             +.5587           −.2991       +.6126
11    Divide line 10 by negative C10                                  −1               +.5353       −1.0961
12    Insert values for r’s                                                            −r04 −.691   +.3090
13    Multiply items in line 1, D to I, by D2                                          +.1894       +.0778
14    Multiply items in line 5, D to I, by D6                                          +.1247       +.0257
15    Multiply items in line 10, D to I, by D11                                                     −.0339
16    Add algebraically lines 12, 13, 14, and 15                                       −.1880       +.3786
17    Divide line 16 by negative D16                                                   +.3318       −.6682


Substitute values from above table for symbols in the following equations
(β’s for each equation found when each variable in turn is solved for), and
solve equations for the regression coefficients, β1, β2, β3, and β4.

β4 = I17 = +.3318 (When working for r04.123, the only regression coeffi-
                   cient needed from this sheet is β4)
β3 = (β4)(D11) + I11
β2 = (β4)(D6) + (β3)(C6) + I6
β1 = (β4)(D2) + (β3)(C2) + (β2)(B2) + I2
2 have been ruled out, and σ1.02 to be a similar standard deviation
for factor 1, we have

$$r_{01.2} = \beta_{01.2}\,\frac{\sigma_{1.02}}{\sigma_{0.12}}$$

and correspondingly,

$$r_{10.2} = \beta_{10.2}\,\frac{\sigma_{0.12}}{\sigma_{1.02}}$$

The r01.2 is precisely the same as the r10.2, since the former is
the regression of factor 0 upon factor 1 and the latter the regres-
sion of factor 1 on factor 0; and these are the same in the case of
the r’s. But that is not true in the case of the β’s. Let us now
multiply these two equations together. We shall get the partial
r squared, since the two of these have the same value. The
fractions containing the partial sigmas cancel, since one is the
reciprocal of the other. Hence,


$$r_{01.2} = \sqrt{\beta_{01.2}\,\beta_{10.2}} \qquad (140)$$


The general case is obviously similar.


$$r_{0k.1234\ldots} = \sqrt{\beta_{0k.1234\ldots}\,\beta_{k0.1234\ldots}} \qquad (141)$$


Thus the partial r may be obtained by taking the square root
of the product of the two corresponding partial regression coeffi-
cients. One of these, β0k.1234.., we have already learned how
to compute. The other differs from this only in having the 0
and the k interchanged, where the 0 is the criterion factor and
the k is any other factor in which we are at the moment inter-
ested. Hence all we need to do is to interchange the position
of these two variables in the work sheet, then find the new regres-
sion coefficient in precisely the same manner as we did the original
one. The easiest way in which to avoid confusion in this inter-
change is to write out a table of new equivalents for the original
correlations. Suppose, for example, you wish in a four-variable
problem to shift factor 3 into the criterion place in exchange for
factor 0. You must substitute a 0 for each 3 and a 3 for each 0.


New r01 = old r13
New r02 = old r23
New r03 = old r03
New r13 = old r01
New r23 = old r02


Any correlations not containing 0 or 3 are unaffected. Having
substituted the new values for the coefficients, forget about the
interchange and do just what the work sheet says, as before. The new β
is found at the position in your work sheet exactly corresponding
with that of the old one. Multiply this new β by the old one
that stood at a corresponding position in the original work sheet,
take the square root of the product, and you will have the required
coefficient of partial correlation. Your radical will, of course,
have the ambiguous sign, as all square roots do. Affix to the
root the sign of the partial β’s from which it was derived. Both
of the β’s will always have the same sign if the work has been
correctly done. This process must be repeated as many times
as there are partial r’s to be determined.

If only one partial r is required, as is often the case, give the 
factor involving it the highest number of the set, and thus place it 
in the column nearest the right of the work sheet (except, of
course, the criterion). Then all except a very few of the calcula- 
tions will be the same as the one involved in the original work 
sheet and much labor will be saved. We are giving on an 
accompanying page a sample of the work. We have drawn 
double lines around the parts that involve new computations. 
If the reader will compare this work sheet with the one on which 
the partial regression coefficient for factor 4 in Miss Gresser’s 
problem was computed, he will see that all the others reappear 
here in precisely the same form as there, but some of them in 
different positions. In order to show this, we have copied all 
the work but blocked in with heavy lines the only elements it 
would really be necessary to copy, for the sake of the new 
operations or for the new check. 

The partial r is $\sqrt{.5203 \times .3318} = .415$. This is the coefficient
of correlation between knowledge of vocabulary and ability to
comprehend literature when the factors of speed of reading,
knowledge of grammar, and general information are held con-
stant. It will be observed that the partial r does not differ
very much from the partial β.
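
The interchange procedure can be checked with a short Python sketch. It is offered under several assumptions that should be kept in view. The function `solve_normal_equations` is a hypothetical helper that uses ordinary Gaussian elimination; it stands in for the Doolittle work sheet rather than reproducing its layout. The correlations .370, .396, .642, .567, .578, .735, and .691 are read from the work sheet, but the three correlations of the criterion with factors 1, 2, and 3 (.334, .416, .653) are not legible in this copy and are inferred here from the check-column sums, so they should be treated as assumed values.

```python
def solve_normal_equations(intercorrelations, criterion_correlations):
    """Solve the normal equations R * beta = r by Gaussian elimination."""
    n = len(criterion_correlations)
    # Augmented matrix [R | r]; copying keeps the caller's lists untouched.
    rows = [list(intercorrelations[i]) + [criterion_correlations[i]] for i in range(n)]
    # Forward elimination (no pivoting; adequate for a well-behaved
    # correlation matrix such as this illustration uses).
    for pivot in range(n):
        for lower in range(pivot + 1, n):
            factor = rows[lower][pivot] / rows[pivot][pivot]
            for col in range(pivot, n + 1):
                rows[lower][col] -= factor * rows[pivot][col]
    # Back substitution, last factor first, as on the work sheet.
    betas = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * betas[j] for j in range(i + 1, n))
        betas[i] = (rows[i][n] - known) / rows[i][i]
    return betas


# Read from the work sheet: intercorrelations among factors 1, 2, 3,
# and the correlations of factor 4 with factors 1, 2, 3 and with criterion 0.
r12, r13, r23 = 0.370, 0.396, 0.642
r14, r24, r34, r04 = 0.567, 0.578, 0.735, 0.691
# ASSUMPTION: criterion correlations inferred from the check-column sums.
r01, r02, r03 = 0.334, 0.416, 0.653

# Original arrangement: 0 is the criterion, factor 4 in the last column.
beta_04 = solve_normal_equations(
    [[1, r12, r13, r14], [r12, 1, r23, r24], [r13, r23, 1, r34], [r14, r24, r34, 1]],
    [r01, r02, r03, r04])[-1]
# Interchanged arrangement: 4 is the criterion, 0 in the last column.
beta_40 = solve_normal_equations(
    [[1, r12, r13, r01], [r12, 1, r23, r02], [r13, r23, 1, r03], [r01, r02, r03, 1]],
    [r14, r24, r34, r04])[-1]

partial_r = (beta_04 * beta_40) ** 0.5
if beta_04 < 0:
    partial_r = -partial_r  # the root takes the sign of the two betas
print(round(beta_04, 4), round(beta_40, 4), round(partial_r, 3))
```

Under these assumed inputs the two β's and their geometric mean agree with the values quoted in the text to about three decimal places, which is some reassurance that the inferred correlations are the ones the work sheet was built on.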

Let us again get in mind the meaning of partial regression and 
of partial correlation in order that we may raise the question 
of the value of knowing each. The partial r shows the slope 
of the line relating our factors when the influence of certain other 
factors is eliminated and the variabilities of the residual scores 
have been equalized as between the criterion and the factor being 
correlated with it. The partial β is the slope of this same line
without equalization of the variabilities of the residual scores
but with equal initial variabilities. In other words, the standard
deviations of the measures in terms of which we take our original
measurements are equal in the case of the partial β’s, but the
partial standard deviations (which we shall discuss shortly) are
not necessarily equal. It is here suggested that, since partial β’s
are simpler in meaning than partial r’s, since they are easier to
compute by our technique (although the reverse was true by
the old Yule method), and since they are not likely to differ
much from partial r’s, it would be preferable as a rule to make
our showings in terms of partial β’s rather than in terms of
partial r’s. We can take initial measurements in terms of equal
standard deviations, but partial standard deviations exist
only theoretically. To ask how factors as we can know them are
related to one another when the overlapping elements have been
removed seems to be more sensible and meaningful than to ask
how they would be related if they could be measured in terms
adjusted for a variability that is never concretely accessible but
that can exist only in imagination. Nevertheless, in spite of
the greater convenience of partial β’s, statistical workers will
need to make some use of partial r’s because people are more
accustomed to thinking in terms of coefficients of correlation
than in terms of regression coefficients. But when partial
regression coefficients are announced as evidences of closeness
of relation between criterion and factor, or as evidence of the
relative amounts of such relation among the several factors,
they should be β’s, not b’s; i.e., they should be the regression
coefficients with variabilities equalized as these come from our
work sheets. If they are otherwise announced, as is sometimes 
done, it is impossible for a reader to interpret them without sup- 
plementary information about the variabilities of the criterion 
and of the several related factors, and even then the interpreta- 
tion is rather awkward. 


MULTIPLE CORRELATION 


The coefficient of multiple correlation is the r to be expected 
between scores on our criterion and the scores predicted for the 
individuals by the partial regression equation. It is, thus, the r 
between the criterion as actually obtained and the criterion as 
predicted from the whole team of related factors, each multiplied 
by its regression coefficient. Since the regression coefficients
represent the best possible weights for prediction purposes, the
multiple r is the highest r that could be obtained between the
team on the one hand and the criterion on the other. For
multiple correlation we customarily employ the capital R rather
than the lower case r. A formula for multiple R is very easily
developed.

At the opening of this chapter we let $\bar{z}_0$ stand for such predicted
score on the criterion and gave as its value

$$\bar{z}_0 = \beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \cdots + \beta_n z_n$$


where the subscripts are simply more abbreviated ways of
indicating the partial β’s than we used earlier in the chapter.
Since all of these are in deviation form, and since the multiple R
is, as said above, the coefficient of correlation between the
calculated z and the observed one, we have, as our formula for
multiple correlation,

$$R = \frac{\Sigma z_0\bar{z}_0}{N\sigma_{z_0}\sigma_{\bar{z}_0}} = \frac{\Sigma z_0(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \cdots + \beta_n z_n)}{N\sigma_{z_0}\sigma_{\bar{z}_0}}$$


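Multiplying through the equation by $\sigma_{z_0}\sigma_{\bar{z}_0}$ and placing the N
and the $\Sigma z_0$ with each of the terms instead of with the expression
as a whole, we have

$$R\,\sigma_{z_0}\sigma_{\bar{z}_0} = \beta_1\frac{\Sigma z_0 z_1}{N} + \beta_2\frac{\Sigma z_0 z_2}{N} + \beta_3\frac{\Sigma z_0 z_3}{N} + \cdots + \beta_n\frac{\Sigma z_0 z_n}{N}$$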


On the right we have formulas for β1r01, β2r02, etc. But the value
on the left side of the equation demands examination. $\sigma_{z_0} = 1$,
for it is the standard deviation of a full set of standard measures,
and the standard deviation of any full set of standard measures
is 1.¹ The $\sigma_{\bar{z}_0}$ is not quite so simple. It is not the standard
deviation of a full set of z scores but of the z scores that are
calculated as lying on the regression line. We must find a value
for such a standard deviation. We shall take first the general
case, in deviation form.

1 This may be easily proved as follows:

$$\sigma_z^2 = \frac{\Sigma z^2}{N} = \frac{\Sigma (x/\sigma_x)^2}{N} = \frac{\Sigma x^2}{N\sigma_x^2} = \frac{\sigma_x^2}{\sigma_x^2} = 1$$

$$\sigma_{\bar{y}}^2 = \frac{\Sigma \bar{y}^2}{N} = \frac{\Sigma\left(r\,\frac{\sigma_y}{\sigma_x}\,x\right)^2}{N} = r^2\,\frac{\sigma_y^2}{\sigma_x^2}\cdot\frac{\Sigma x^2}{N} = r^2\,\frac{\sigma_y^2}{\sigma_x^2}\,\sigma_x^2 = r^2\sigma_y^2$$

$$\sigma_{\bar{y}} = r\sigma_y$$


Thus the standard deviation of scores calculated as lying on a
rectilinear regression line is r times the standard deviation of
the whole set of scores. When we parallel this with the case we
want, we have

$$\sigma_{\bar{z}_0} = R\,\sigma_{z_0}$$

But $\sigma_{z_0} = 1$. Therefore $\sigma_{\bar{z}_0} = R$. Making in our last R equa-
tion (above) all of these substitutions we get,

$$R^2 = \beta_1 r_{01} + \beta_2 r_{02} + \beta_3 r_{03} + \cdots + \beta_n r_{0n}$$

Taking the square root and substituting again the more complete
and conventional subscripts for the β’s and for R,

$$R_{0.123\ldots n} = \sqrt{\beta_{01.234\ldots n}\,r_{01} + \beta_{02.134\ldots n}\,r_{02} + \cdots + \beta_{0n.1234\ldots}\,r_{0n}}$$
(Coefficient of multiple correlation) (142)


Normal account must, of course, be taken of algebraic signs. 

When we apply this formula to the data of Miss Gresser's 
problem, we find a multiple correlation coefficient of .73. The
highest zero-order coefficient was .691. The multiple correla-
tion coefficient is never lower than the highest zero-order one
in the team and is ordinarily higher, and the gain by using the
multiple regression technique for purposes of prediction is
indicated by the amount by which the multiple correlation is
increased over the highest simple one. A multiple correlation
of .73 means that, if measurements of a group were taken on the
four factors related in the problem to comprehension of literature
and composite scores made for pupils by multiplying their scores
on the several tests by the optimum weights indicated by the
partial regression coefficients, these scores would predict standings
in comprehension of literature with an accuracy represented by a
coefficient of correlation of .73.
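
Formula (142) can be tried on Miss Gresser's correlations in a brief Python sketch. The same cautions apply as before: `solve_normal_equations` is a hypothetical helper using plain Gaussian elimination in place of the Doolittle sheet, and the three criterion correlations .334, .416, and .653 are assumed values inferred from the work sheet's check sums, since they are not legible in this copy. Only r04 = .691 and the intercorrelations among the four factors are read directly.

```python
def solve_normal_equations(intercorrelations, criterion_correlations):
    """Solve the normal equations R * beta = r by Gaussian elimination."""
    n = len(criterion_correlations)
    rows = [list(intercorrelations[i]) + [criterion_correlations[i]] for i in range(n)]
    for pivot in range(n):  # forward elimination, no pivoting
        for lower in range(pivot + 1, n):
            factor = rows[lower][pivot] / rows[pivot][pivot]
            for col in range(pivot, n + 1):
                rows[lower][col] -= factor * rows[pivot][col]
    betas = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(rows[i][j] * betas[j] for j in range(i + 1, n))
        betas[i] = (rows[i][n] - known) / rows[i][i]
    return betas


# Intercorrelations of the four factors (read from the work sheet).
factor_matrix = [
    [1.000, 0.370, 0.396, 0.567],
    [0.370, 1.000, 0.642, 0.578],
    [0.396, 0.642, 1.000, 0.735],
    [0.567, 0.578, 0.735, 1.000],
]
# Criterion correlations r01..r04; the first three are ASSUMED (inferred
# from the check-column sums), the last is the .691 quoted in the text.
criterion_r = [0.334, 0.416, 0.653, 0.691]

betas = solve_normal_equations(factor_matrix, criterion_r)
# Formula (142): R squared is the sum of each beta times its zero-order r.
multiple_R = sum(b * r for b, r in zip(betas, criterion_r)) ** 0.5
print(round(multiple_R, 2), max(criterion_r))
```

With these inputs the multiple R comes out close to the .73 reported above and exceeds the highest zero-order coefficient, as the argument requires.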
A problem where prediction is relatively more important than
it is in this problem of Miss Gresser’s is a small one done at
Pennsylvania State College having to do with prediction of
college success by a battery of objective tests given in high school.
The Carnegie Foundation for the Advancement of Teaching
gave to high-school seniors in Pennsylvania such a battery of
tests. When these were worked up with grades in the freshman
year of our School of Engineering as criterion, they showed the
zero-order r’s and the partial β’s set opposite them as follows:


Variate

Mathematics
Physical science
American history
Foreign language
Otis intelligence test
English


This battery yields, by application of our formula, a multiple 
correlation coefficient of .78. The reader may compare this 
again with the highest zero-order correlation and ask himself 
whether in attempting to predict college success it is worth while 
to make a battery of all these test factors rather than to predict 
by means of one of them. 


PARTIAL SIGMAS 

The reader will surely be impressed, as we proceed, with the
complete parallelism between simple and partial correlation. 
All that we said on pages 110 to 123 by way of interpreting the 
meaning of a coefficient of correlation applies with equal force 
to partial and multiple correlation. The same is true of what we 
said in that chapter about prediction by means of the simple 
regression equation. We learned there that, when we predict 
by means of a simple regression equation, we miss somewhat 
the actual scores we are attempting to predict. The standard 
deviation of these errors we called the standard error of estimate. 
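The formula for its amount was $\sigma_{\text{est}\,y} = \sigma_y\sqrt{1 - r^2}$. A parallel
thing is true of prediction by means of the partial regression
equation. Let us run through a development of a formula for
this. Let $z_0$ be an obtained score on the criterion and $\bar{z}_0$ be a
score for the same individual predicted from the team of factors,
each with its best weight. Then in any particular case the error
will be $(z_0 - \bar{z}_0)$. We need the standard deviation of these errors.
We shall treat them, as usual, in deviation form.

$$\sigma^2_{(z_0 - \bar{z}_0)} = \frac{\Sigma(z_0 - \bar{z}_0)^2}{N} = \frac{\Sigma z_0^2}{N} - \frac{2\Sigma z_0\bar{z}_0}{N} + \frac{\Sigma\bar{z}_0^2}{N}$$

$$= \sigma^2_{z_0} - 2\,\frac{\Sigma z_0\bar{z}_0}{N\sigma_{z_0}\sigma_{\bar{z}_0}}\,\sigma_{z_0}\sigma_{\bar{z}_0} + \sigma^2_{\bar{z}_0}$$

$$= \sigma^2_{z_0} - 2R\,\sigma_{z_0}\sigma_{\bar{z}_0} + \sigma^2_{\bar{z}_0}$$

But we showed a few moments ago that $\sigma_{\bar{z}_0} = R\sigma_{z_0}$. Making
this substitution in the two places where the $\sigma_{\bar{z}_0}$ occurs, we
have

$$\sigma^2_{(z_0 - \bar{z}_0)} = \sigma^2_{z_0} - 2R^2\sigma^2_{z_0} + R^2\sigma^2_{z_0}$$

$$= \sigma^2_{z_0} - R^2\sigma^2_{z_0}$$

$$= \sigma^2_{z_0}(1 - R^2)$$

$$\sigma_{(z_0 - \bar{z}_0)} = \sigma_{z_0}\sqrt{1 - R^2}$$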


Replacing the symbol on the left by one that is more conventional
in this connection and using our more familiar symbol with the
same meaning for the σ on the right, we have

$$\sigma_{0.1234\ldots n} = \sigma_0\sqrt{1 - R^2_{0.1234\ldots n}}$$
(Formula for partial sigma, the standard error of estimate in
partial regression) (143)

This is the theoretical standard deviation of the scatter of 
the scores predicted by the team from the regression line. If 
the reader will hold in his mind’s eye a correlation chart, it is the 
measure of the scatter of the columns in such chart when the 
scores are all placed in the columns to which they properly belong 
as determined by all the other factors that make up the team. 
Whatever scatter still remains in these columns is due to factors 
additional to the ones caught in the team; if all had been included, 
there would be no scatter in the columns and the R would be 
perfect. In the chart of simple (zero-order) correlation, too, 
the standard error of estimate is, as we previously saw, the 
standard deviation of the columns when the scores have been
grouped homogeneously with respect to the criterion factor,
i.e., of the y scores when they have been grouped into columns
of like x values. This standard error of estimate in multiple
regression is called, as indicated above, the partial sigma.
Analogously the standard error of estimate might have been
called a partial sigma. Indeed it is sometimes written σ1.2
and called a partial sigma of the first order while the σ1.234...n
is called a partial sigma of the nth order.
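
That the partial sigma of formula (143) really is the standard deviation of the errors of prediction can be verified numerically. The Python sketch below is only an illustration: the six sets of scores are hypothetical, the helper names are ours, and the two-factor β's are taken from formula (144). Because everything is worked in z scores, σ0 = 1 and the formula reduces to the square root of (1 − R²).

```python
from math import sqrt


def z_scores(values):
    """Turn raw scores into z scores, using the population sigma (divisor N)."""
    m = sum(values) / len(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def correlation(za, zb):
    """Product-moment r of two sets of z scores: the mean of their products."""
    return sum(a * b for a, b in zip(za, zb)) / len(za)


# Hypothetical scores for six individuals (not from the text).
criterion = [1, 3, 2, 5, 4, 6]
factor_1 = [1, 2, 3, 4, 5, 6]
factor_2 = [2, 1, 4, 3, 6, 5]

z0, z1, z2 = z_scores(criterion), z_scores(factor_1), z_scores(factor_2)
r01, r02, r12 = correlation(z0, z1), correlation(z0, z2), correlation(z1, z2)

# Partial betas for the two-factor team, formula (144).
beta_1 = (r01 - r02 * r12) / (1 - r12 ** 2)
beta_2 = (r02 - r01 * r12) / (1 - r12 ** 2)
# Multiple R, formula (142).
R = sqrt(beta_1 * r01 + beta_2 * r02)

# Errors of prediction: obtained z score minus the z score predicted by the team.
errors = [a - (beta_1 * b + beta_2 * c) for a, b, c in zip(z0, z1, z2)]
observed_sigma = sqrt(sum(e ** 2 for e in errors) / len(errors))
formula_sigma = sqrt(1 - R ** 2)  # formula (143) with sigma_0 = 1
print(observed_sigma, formula_sigma)
```

The two printed values agree except for rounding in the machine arithmetic, whatever scores are supplied, so long as the two factors are not perfectly correlated with each other.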


SOME COMPLETED FORMULAS 

For any large number of variables, formulas for partial cor- 
relation and for partial regression coefficients become extremely 
complicated. But it is convenient to have such formulas, 
couched in terms of only zero-order correlations, for three or 
four variable problems. Beyond that point the worker should 
employ the Doolittle work sheets provided earlier in this chapter. 
The formulas given below might readily have been worked out 
with these work sheets, using general terms instead of arithmetical 
quantities, so that no new principles would be involved. As a 
matter of convenience, however, they were worked out by the 
method of determinants, which is another method of solving 
sets of simultaneous equations. 


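FORMULAS FOR REGRESSION COEFFICIENTS

$$\beta_{01.2} = \frac{r_{01} - r_{02}r_{12}}{1 - r_{12}^2}; \qquad \beta_{02.1} = \frac{r_{02} - r_{01}r_{12}}{1 - r_{12}^2} \qquad (144)$$

$$\beta_{01.23} = \frac{r_{01}(1 - r_{23}^2) - r_{02}(r_{12} - r_{13}r_{23}) + r_{03}(r_{12}r_{23} - r_{13})}{(1 - r_{23}^2) - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})} \qquad (145)$$

$$\beta_{02.13} = \frac{r_{02}(1 - r_{13}^2) - r_{01}(r_{12} - r_{13}r_{23}) - r_{03}(r_{23} - r_{12}r_{13})}{(1 - r_{23}^2) - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})} \qquad (145a)$$

$$\beta_{03.12} = \frac{r_{01}(r_{12}r_{23} - r_{13}) - r_{02}(r_{23} - r_{12}r_{13}) + r_{03}(1 - r_{12}^2)}{(1 - r_{23}^2) - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})} \qquad (145b)$$


FORMULAS FOR PARTIAL CORRELATION COEFFICIENTS

$$r_{01.2} = \frac{r_{01} - r_{02}r_{12}}{\sqrt{1 - r_{02}^2}\,\sqrt{1 - r_{12}^2}} \qquad (146)$$

$$r_{02.1} = \frac{r_{02} - r_{01}r_{12}}{\sqrt{1 - r_{01}^2}\,\sqrt{1 - r_{12}^2}} \qquad (146a)$$

$$r_{01.23} = \frac{r_{01}(1 - r_{23}^2) - r_{02}(r_{12} - r_{13}r_{23}) + r_{03}(r_{12}r_{23} - r_{13})}{\sqrt{2r_{02}r_{03}r_{23} + 1 - r_{02}^2 - r_{03}^2 - r_{23}^2}\;\sqrt{1 - r_{23}^2 - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})}} \qquad (147)$$

$$r_{02.13} = \frac{r_{02}(1 - r_{13}^2) - r_{01}(r_{12} - r_{13}r_{23}) - r_{03}(r_{23} - r_{12}r_{13})}{\sqrt{2r_{01}r_{03}r_{13} + 1 - r_{01}^2 - r_{03}^2 - r_{13}^2}\;\sqrt{1 - r_{23}^2 - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})}} \qquad (147a)$$

$$r_{03.12} = \frac{r_{01}(r_{12}r_{23} - r_{13}) - r_{02}(r_{23} - r_{12}r_{13}) + r_{03}(1 - r_{12}^2)}{\sqrt{2r_{01}r_{02}r_{12} + 1 - r_{01}^2 - r_{02}^2 - r_{12}^2}\;\sqrt{1 - r_{23}^2 - r_{12}(r_{12} - r_{13}r_{23}) + r_{13}(r_{12}r_{23} - r_{13})}} \qquad (147b)$$

1 Such formulas for problems up to five variables are given in an article
by Peters and Wykes in the J. Educ. Res., Vol. 24, pp. 44-52 (June, 1931).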
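
The four-variable formulas lend themselves to a quick numerical check. The following Python sketch, with hypothetical function names and hypothetical correlations, codes formula (145) and formula (147) and confirms that the partial r obtained directly agrees with the square root of the product of the two corresponding β's, the second β being found by interchanging 0 and 1 as described under partial correlation.

```python
from math import sqrt


def beta_01_23(r01, r02, r03, r12, r13, r23):
    """Partial regression coefficient of 0 on 1 with 2 and 3 held constant, formula (145)."""
    numerator = r01 * (1 - r23 ** 2) - r02 * (r12 - r13 * r23) + r03 * (r12 * r23 - r13)
    denominator = (1 - r23 ** 2) - r12 * (r12 - r13 * r23) + r13 * (r12 * r23 - r13)
    return numerator / denominator


def partial_r_01_23(r01, r02, r03, r12, r13, r23):
    """Partial correlation of 0 and 1 with 2 and 3 held constant, formula (147)."""
    numerator = r01 * (1 - r23 ** 2) - r02 * (r12 - r13 * r23) + r03 * (r12 * r23 - r13)
    root_a = sqrt(2 * r02 * r03 * r23 + 1 - r02 ** 2 - r03 ** 2 - r23 ** 2)
    root_b = sqrt(1 - r23 ** 2 - r12 * (r12 - r13 * r23) + r13 * (r12 * r23 - r13))
    return numerator / (root_a * root_b)


# Hypothetical zero-order correlations (not from the text).
r01, r02, r03, r12, r13, r23 = 0.5, 0.4, 0.3, 0.3, 0.2, 0.1

b01 = beta_01_23(r01, r02, r03, r12, r13, r23)
# Interchanging 0 and 1 swaps r02 with r12 and r03 with r13;
# r01 and r23 contain both or neither of the two and are unaffected.
b10 = beta_01_23(r01, r12, r13, r02, r03, r23)

from_betas = sqrt(b01 * b10)  # both betas are positive here, so the root is positive
direct = partial_r_01_23(r01, r02, r03, r12, r13, r23)
print(from_betas, direct)
```

The agreement follows from the fact that the two β's share the same numerator, while their denominators are exactly the two quantities under the radicals in formula (147).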


RELIABILITY FORMULAS 
The reliability formulas for partial and multiple correlation
are closely parallel to those for zero-order r’s and b’s and are
interpreted and applied in precisely the same manner. The
standard errors are given below. The P.E. is in each case, of
course, equal to .6745 times the standard error.¹

$$\sigma_{r_{01.23\ldots n}} = \frac{1 - r^2_{01.23\ldots n}}{\sqrt{N}} \qquad (148)$$

$$\sigma_{\beta_{01.23\ldots n}} = \frac{\sqrt{1 - R^2_{0.123\ldots n}}}{\sqrt{N}\,\sqrt{1 - R^2_{1.23\ldots n}}} \qquad (149)$$

or,

$$\sigma_{\beta_{01.23\ldots n}} = \frac{\sqrt{(1 - R^2_{0.23\ldots n})(1 - \beta_{01.23\ldots n}\,\beta_{10.23\ldots n})}}{\sqrt{N}\,\sqrt{1 - R^2_{1.23\ldots n}}}$$

$$\sigma_{b_{01.23\ldots n}} = \frac{\sigma_0\sqrt{1 - R^2_{0.123\ldots n}}}{\sqrt{N}\,\sigma_1\sqrt{1 - R^2_{1.23\ldots n}}} \qquad (150)$$

$$\sigma_{R_{0.123\ldots n}} = \frac{1 - R^2_{0.123\ldots n}}{\sqrt{N}} \qquad (151)$$

1 For small samples N should be replaced by (N − a), where a is the num-
ber of arrays intercorrelated (including the criterion), when the σ is to be
used with Student’s distribution for testing the hypothesis that the true
regression or correlation coefficient may be zero.

POSSIBILITIES AND PRECAUTIONS

Under proper conditions the partial regression technique
is a very valuable tool of research. It has been employed
extensively in agricultural research and in psychological and
educational investigations. It has especially promising possibili-
ties as a tool for research in sociology and economics. The partial
correlation technique can be made a substitute for controlled
experimentation, in addition to the two uses discussed earlier in
this chapter. In controlled experimentation we make two
groups, an experimental and a control, equivalent by selection,
then apply an experimental factor to the former group while 
withholding it from the latter. Thus we are enabled to learn 
how, when all other pertinent conditions are held constant by 
selection and manipulation, the experimental factor is related to 
certain measured outcomes which we use as a criterion of success. 
In partial correlation we hold all of a set of factors except one 
constant in order that we may determine the relation between 
this one and a criterion, but we hold the disturbing ones constant 
by statistical analysis rather than by actual selection and 
manipulation. There are many situations, especially in social 
science research, where we are not at liberty actually to manipu- 
late the conditions; we must accept them as they come, with all 
their entanglements. Here the disturbing factors may be held 
constant by the partial correlation technique, and thus the 
equivalent of a controlled experiment may become feasible. 
The effect of high tariffs on imports, for example, cannot, under 
present notions of the function of governments, be studied by a 
scientifically controlled experiment, but it might be investigated 
by the partial-correlation method. 

But some precautions must be noted regarding this technique. 
Partial correlations are notoriously affected by errors in measure- 
ment. The final values of the partials turn upon small differ- 
entials in the zero-order r’s, especially as the number of variables 
increases. It is, therefore, very important that the validity 
and the reliability of the basic r’s be high, and the more so if
the number of variables is considerable. Hence to justify the
application of this technique to problems of more than three or
four variables, the number of cases upon which the r’s are based
must be large. The formulas as we have given them assume,
too, rectilinearity of regression between all the correlated arrays.
It is possible to have corresponding formulas for curvilinear
regressions, but the formulas would be necessarily very complex.
Under certain circumstances it might be feasible to translate raw
scores that involve curvilinear relations into new ones between
which the regressions would be rectilinear, then compute partial
r’s in terms of these transmuted scores.
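
How sharply a partial r may turn upon a small differential in the zero-order r's can be seen in a short Python sketch. The correlations are hypothetical, chosen only to be fairly high and close together, and the function name is ours; the computation itself is formula (146).

```python
from math import sqrt


def partial_r_01_2(r01, r02, r12):
    """First-order partial correlation of 0 and 1 with 2 held constant, formula (146)."""
    return (r01 - r02 * r12) / (sqrt(1 - r02 ** 2) * sqrt(1 - r12 ** 2))


# Hypothetical zero-order correlations (not from the text).
baseline = partial_r_01_2(0.80, 0.75, 0.90)
# The same problem with r02 altered by only .03, an amount that errors of
# measurement or of sampling might easily produce.
shifted = partial_r_01_2(0.80, 0.78, 0.90)

change_in_partial = abs(baseline - shifted)
print(round(baseline, 3), round(shifted, 3), round(change_in_partial, 3))
```

In this illustration a shift of .03 in one zero-order coefficient moves the partial r by more than twice that amount, and the disproportion tends to grow as the intercorrelations approach unity.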


Exercises 


1. In 1919 six objective tests were administered to 900 freshmen entering
the engineering school of Purdue University. The purpose was to ascertain
which test individually and what combination of tests would best predict
college success as indicated by grades made in the freshman year. The
following correlations resulted. Calculate the “weight” with which each
should be entered in the team, and the multiple R. Find the partial r of
intelligence with scholarship and all standard errors.


                         Freshman      Arith-              Geom-    Intelli-              Technical
                         scholarship   metic    Algebra    etry     gence      Physics    information
Arithmetic                 .51           -        .54       .45      .55        .58         .46
Algebra                    .46          .54        -        .38      .38        .38         .24
Geometry                   .374         .45       .38        -       .22        .43         .45
Intelligence               .343         .55       .38       .22       -         .28         .34
Physics                    .339         .58       .38       .43      .28         -          .41
Technical information      .293         .46       .24       .45      .34        .41          -


2. The following table of intercorrelations was taken from an article in 
The School Review for March, 1924. The number of pupils involved in the 
study was 213. Compute from it the weights of the several relevant factors 
in predicting success in algebra (you to decide which ones are relevant), the 
multiple R, and the reliabilities of the statistics you calculate. 


                               Achieve-    Kuhlman     Terman                 Teacher       Lee
                               ment-test   intelli-    intelli-    Trait      ratings on    algebra
                               scores      gence       gence       ratings    aptitude      aptitude
Teacher’s marks                 .538        .481        .437        .599       .593          .459
Achievement-test scores                     .559        .472        .388       .531          .624
Kuhlman intelligence                                    .834        .359       .440          .690
Terman intelligence                                                 .287       .341          .564
Trait ratings                                                                  .525          .286
Teacher ratings on aptitude                                                                  .474


3. By the partial regression technique determine the weight of each of the 
factors in Table IV in predicting final grade-point average. 

4. Determine the multiple correlation between the team of factors 
involved in Exercise 3 and final grade-point average. 

5. Compute the partial r between at least one of the tests of Exercise 3 and 
final grade-point average. 




References for Further Study 


Burks, Barbara: “On the Inadequacy of the Partial and Multiple Correla-
tion Technique,” J. Educ. Psychol., Vol. 17, pp. 532-554, 625-630.

Ezekiel, Mordecai: “A Method of Handling Curvilinear Correlations for
Any Number of Variables,” J. Amer. Statistical Assoc., Vol. 19, pp.
431-453.

Fisher, R. A.: “On the Influence of Rainfall on the Yield of Wheat at
Rothamsted,” Trans. Roy. Soc. (London), Series B, Vol. 213, p. 91.

Griffin, H. D.: “Nomographs for Correcting Simple and Multiple Correla-
tion Coefficients,” J. Amer. Statistical Assoc., Vol. 25, pp. 316-319.

Holzinger, Karl J.: “On Tetrad Differences with Overlapping Variables,”
J. Educ. Psychol., Vol. 20, pp. 91-97.

Larson, S. C.: “Shrinkage of the Coefficient of Multiple Correlation,”
J. Educ. Psychol., Vol. 22, pp. 45-55.

Pearson, Karl: “On the Partial Correlation Ratio,” Proc. Roy. Soc.
(London), Series A, Vol. 91, p. 492 (1915).

Spearman, C.: “Disturbers of Tetrad Differences,” J. Educ. Psychol., Vol.
21, pp. 559-573.

Watkins, R. J.: “The Use of Coefficients of Net Determination in Testing
the Economic Validity of Correlation Results,” J. Amer. Statistical
Assoc., Vol. 25, pp. 191-197.

Wishart, J.: “The Mean and Second Moment Coefficient of the Multiple
Correlation Coefficient,” Biometrika, Vol. 22, pp. 353-367. (Distribu-
tion of sample R’s when the true R is zero.)

Yule, G. U.: An Introduction to the Theory of Statistics, Charles Griffin and
Company, Ltd., 1919, Chap. 12.


CHAPTER IX 
MULTIPLE-FACTOR ANALYSIS 
TETRAD-DIFFERENCE TECHNIQUE 


Under the leadership of Spearman there was developed, 
comparatively recently, a technique for investigating the 
presence of a general factor running through three or more sets 
of variables. The usual combination is of four variables, hence 
the name tetrad differences; though the technique may be extended 
into “pentads” or into combinations involving an even larger 
number of variables. Spearman’s original purpose was to 
investigate a hypothesis that “intelligence” may be explained 
by a common g factor plus a number of specific factors. But the 
tetrad difference technique may be utilized in a variety of research 
situations additional to the one for which it was first employed. 

Since this technique has to do with the presence of a factor 
common to different sets of variables and hence with an element 
of overlapping among the factors, it is most natural to take 
our point of departure for the development of the formulas from 
our treatment of correlation as percentage of overlapping between 
the correlated variables, page 121. On that page we saw that 
if two variables, x and y, have in them a common factor, c, while
the other elements of the two are unique rather than common,
then

$$r_{xy} = \frac{\sigma_c^2}{\sigma_x\sigma_y}$$

This may be separated into two factors, so that we may write

$$r_{xy} = \frac{\sigma_c}{\sigma_x}\cdot\frac{\sigma_c}{\sigma_y}$$

In this connection it will be more convenient for us to employ
numerical subscripts rather than literal ones and also to use a
simpler symbol for each expression of the form, σc/σx. Let us,
therefore, write

$$r_{12} = \frac{\sigma_c}{\sigma_1}\cdot\frac{\sigma_c}{\sigma_2} = a_1a_2$$

where a1 stands for σc/σ1 and a2 for σc/σ2.
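Suppose, now, we have not only two variables to which the
element c is common but four. Then all four of these are inter-
correlated through the common element, c, so that

$$r_{12} = \frac{\sigma_c}{\sigma_1}\cdot\frac{\sigma_c}{\sigma_2} = a_1a_2 \qquad\qquad r_{23} = \frac{\sigma_c}{\sigma_2}\cdot\frac{\sigma_c}{\sigma_3} = a_2a_3$$

$$r_{13} = \frac{\sigma_c}{\sigma_1}\cdot\frac{\sigma_c}{\sigma_3} = a_1a_3 \qquad\qquad r_{24} = \frac{\sigma_c}{\sigma_2}\cdot\frac{\sigma_c}{\sigma_4} = a_2a_4$$

$$r_{14} = \frac{\sigma_c}{\sigma_1}\cdot\frac{\sigma_c}{\sigma_4} = a_1a_4 \qquad\qquad r_{34} = \frac{\sigma_c}{\sigma_3}\cdot\frac{\sigma_c}{\sigma_4} = a_3a_4$$


We may now combine certain of these equations by multiplica-
tion so as to make the resulting products equal to the same thing
in all the combinations.

$$r_{12}r_{34} = a_1a_2a_3a_4 \qquad r_{14}r_{23} = a_1a_2a_3a_4 \qquad r_{13}r_{24} = a_1a_2a_3a_4$$


If, now, we subtract in pairs and designate our tetrads by the
symbols indicated below, we have

$$t_{1234} = r_{12}r_{34} - r_{13}r_{24} = 0 \qquad (152)$$

$$t_{1243} = r_{12}r_{34} - r_{14}r_{23} = 0 \qquad (152a)$$

$$t_{1342} = r_{13}r_{24} - r_{14}r_{23} = 0 \qquad (152b)$$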


Three additional tetrads could be made, but the other three 
would be only these with the signs reversed, so that it is unneces- 
sary for us to write them here. 
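
The vanishing of the three tetrads under a single common element may be illustrated with a small Python sketch. The values of a1 to a4 are hypothetical, the function name is ours, and the second case, in which an extra overlap of .10 is added between variables 1 and 2, is included only to show how a further common element disturbs the tetrads that contain that pair.

```python
def tetrad_differences(r):
    """Return (t1234, t1243, t1342); r maps a pair (i, j), i < j, to a correlation."""
    t1234 = r[1, 2] * r[3, 4] - r[1, 3] * r[2, 4]
    t1243 = r[1, 2] * r[3, 4] - r[1, 4] * r[2, 3]
    t1342 = r[1, 3] * r[2, 4] - r[1, 4] * r[2, 3]
    return t1234, t1243, t1342


# Hypothetical values of a_i = sigma_c / sigma_i for four variables.
a = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}
# With one common element only, every correlation is the product a_i * a_j.
one_factor = {(i, j): a[i] * a[j] for i in a for j in a if i < j}
clean = tetrad_differences(one_factor)

# Give variables 1 and 2 a further element in common, raising r12 by .10.
disturbed_r = dict(one_factor)
disturbed_r[1, 2] += 0.10
disturbed = tetrad_differences(disturbed_r)

print(clean)
print(disturbed)
```

In the first case all three differences are zero apart from machine rounding. In the second, the two tetrads that contain r12 depart from zero by .10 × r34 = .042, while the third, which does not involve r12, is untouched.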

It is thus seen that, if a common element c runs through all 
of four sets of variables, the differences between certain pairs 
of products of the r’s are zero. Consequently, when we find such
differences to be zero, that finding indicates the presence of such 
general factor. While this converse proposition does not follow 
from our proof, Spearman¹ gives a proof that it is true. It is
important to note that all the intercorrelations must be deter- 
mined solely by the common element c. There may not be 
between any pair of elements of the tetrad any further common 
element than c or the r involved will be higher than the presence
of the c alone would explain; and, wherever the two elements
occur between which there is a higher correlation than that
mediated by the c, the product will be raised or lowered in value
so that it will no longer equal that of the correlations paired
with it in the tetrad. Our development assumes, then, that the
factors are entirely uncorrelated except through the common
element c.

1 Spearman, C., The Abilities of Man, The Macmillan Company, 1927,
Appendix, pp. iii-vi. (This proposition has, however, been challenged.)

Spearman approaches the tetrad formulas through partial
correlations and arrives at the same result as we gave in our
three tetrads above.¹ Since his development is short, we shall
reproduce it here, with changes in terminology, by way of further
confirmation.

Let $r_{12.c}$ denote the correlation between factors 1 and 2 if
the influence of the common factor were eliminated. Then, according
to our formula, page 243,

$$r_{12.c} = \frac{r_{12} - r_{1c}r_{2c}}{\sqrt{1 - r_{1c}^2}\,\sqrt{1 - r_{2c}^2}}$$


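But we are assuming, as said above, that c constitutes all the
elements common to factors 1 and 2, so that 1 and 2 are
uncorrelated except through c. Therefore $r_{12.c}$ equals zero.
Hence

$$\frac{r_{12} - r_{1c}r_{2c}}{\sqrt{1 - r_{1c}^2}\,\sqrt{1 - r_{2c}^2}} = 0; \qquad r_{12} - r_{1c}r_{2c} = 0; \qquad r_{12} = r_{1c}r_{2c}$$

Going through the same sort of process for $r_{13.c}$, $r_{14.c}$,
$r_{23.c}$, $r_{24.c}$, and $r_{34.c}$, we get

$$r_{13} = r_{1c}r_{3c}; \qquad r_{23} = r_{2c}r_{3c}; \qquad r_{34} = r_{3c}r_{4c}$$

$$r_{14} = r_{1c}r_{4c}; \qquad r_{24} = r_{2c}r_{4c}$$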


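We may now combine these equations in pairs by division and
have

$$\frac{r_{12}}{r_{13}} = \frac{r_{1c}r_{2c}}{r_{1c}r_{3c}} = \frac{r_{2c}}{r_{3c}}; \qquad \frac{r_{24}}{r_{34}} = \frac{r_{2c}r_{4c}}{r_{3c}r_{4c}} = \frac{r_{2c}}{r_{3c}}$$

Since both fractions at the extreme left in the above sets are equal
to the same thing, they are equal to each other. Hence

$$\frac{r_{12}}{r_{13}} = \frac{r_{24}}{r_{34}}; \qquad r_{12}r_{34} = r_{13}r_{24}; \qquad \text{and} \qquad r_{12}r_{34} - r_{13}r_{24} = 0$$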
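
The partial-correlation route can be traced with a few numbers. This is a sketch under stated assumptions: the four correlations with the common factor are hypothetical, and `partial_r` is our own name for the partial-correlation formula quoted above.

```python
import math


def partial_r(r12, r1c, r2c):
    """Correlation between 1 and 2 with the common factor c held constant."""
    return (r12 - r1c * r2c) / (math.sqrt(1 - r1c ** 2) * math.sqrt(1 - r2c ** 2))


# Hypothetical correlations of each variable with the common factor c.
r_c = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}

# If c is the only bond, each intercorrelation is the product r_ic * r_jc.
r12 = r_c[1] * r_c[2]
r13 = r_c[1] * r_c[3]
r24 = r_c[2] * r_c[4]
r34 = r_c[3] * r_c[4]

partial_12 = partial_r(r12, r_c[1], r_c[2])  # vanishes when c is the only bond
ratio_left = r12 / r13                       # reduces to r_2c / r_3c
ratio_right = r24 / r34                      # also reduces to r_2c / r_3c
tetrad = r12 * r34 - r13 * r24

print(partial_12, ratio_left, ratio_right, tetrad)
```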


¹ Ibid., Appendix, p. iii.

By bringing in, in this manner, all the combinations permitted,
we would arrive at precisely the same set of tetrads as concluded 
our first development. It is, then, established that, following 
the proposition by Spearman cited above, if the indicated 
tetrad differences equal zero, there is an element common to 
all four of the factors. But these differences would be precisely 
zero only in the rarest of instances—only when we had an unusual 
stroke of luck or when our measures were perfectly reliable. 
Because of errors of measurement these differences will deviate 
from zero even when there is present a common factor, and our 
only concern must be whether they deviate more from zero than 
the chance involved in fluctuations from sampling would explain. 
Hence we need a formula for the P.E. of a tetrad difference. 
Although it would be feasible to compute P.E.’s for each tetrad 
separately, that procedure is scarcely practicable since the 
number of tetrads involved in most practical applications is 
likely to be of considerable size. To meet this situation, Spear- 
man proposes the following formula for the average P.E. from a 
set of tetrad differences:¹

$$\text{P.E.}_t = \frac{1.349}{\sqrt{N}}\sqrt{\bar{r}^{2}(1 - \bar{r})^{2} + (1 - R)s^{2}} \tag{153}$$

(Average probable error of a set of tetrad differences)

where $\bar{r}$ denotes the mean of all the intercorrelations, $s$
is the standard deviation of all the r’s from their mean, and

$$R = 3\bar{r}\,\frac{n - 4}{n - 2} - 2\bar{r}^{2}\,\frac{n - 6}{n - 2},$$

n being the number of variables intercorrelated while N is
the number of individuals in the population. In order to
refute the hypothesis that the true tetrad may be zero, an
obtained tetrad should be at least four times its P.E.

In the previous edition of this book we carried the treatment of 
tetrad differences further, deriving formulas for the probable 
error and for the correlation of the common factor with each 
of the constituent tests and giving an illustration of the use and 
interpretation of this technique. We are curtailing the treatment
here and referring the interested reader to this earlier edition
and to such more specialized books as Kelley’s Crossroads in the
Mind of Man and Spearman’s Abilities of Man, because the
tetrad-difference technique is now being largely superseded in
this country by the multiple-factor analysis technique, which is
more generalized. But the tetrad-difference technique still has a
limited field of usefulness. The first factor loadings in the
factor-analysis method are (before “rotation”) exactly the same
in value as the r’s between the common factor and the several
tests in the Spearman tetrad-difference technique. However,
the generalized multiple-factor method does not impose the
restriction of no correlation except through the common factor,
and it has a much more efficient way of finding and evaluating
additional group factors.

¹ Ibid., Appendix, p. xi.


THE NATURE OF MULTIPLE-FACTOR ANALYSIS 


Within the past few years several methods of multiple-factor 
analysis have been developed which give substantially equivalent 
results. Of these the two major alternative ones are those by 
L. L. Thurstone and Harold Hotelling. Both Burt and Tryon 
have developed rather major variations on these methods, and 
there have been a host of minor variations. In fact the technique 
of multiple-factor analysis is still (1940) so much in the making 
that it is not feasible to foresee into what form it will ultimately 
settle. There is also still some skepticism regarding the ultimate
value of the present methods of multiple-factor analysis in
practical research, although all who have worked with them have
found them fascinating in theory and in mathematical manipulation.
Nevertheless, the technique has enlisted widespread interest and
extensive use in research.

We could not possibly take the space in this book to explain all 
or even the major methods of multiple-factor analysis. The 
interested reader will need to follow them through the original 
expositions by their authors, or through more comprehensive 
secondary accounts (see references at end of chapter). Of these 
latter, Thomson’s The Factorial Analysis of Human Abilities is at 
present the most comprehensive single-volume account, and it is 
readable and authoritative. 

Of all the methods Thurstone’s centroid method is at present
the most widely known and the most extensively used. We shall,
since we must choose only one, confine our exposition to it.
Thurstone’s exposition of this method is carried in terms of
matrix algebra and the geometry of hyperspace, and its reading
is beyond the ability of a layman in mathematics and difficult
even for those of fair mathematical training.¹ But, fortunately,
for every geometrical argument there is possible a parallel
analytic (algebraic) argument; and we have succeeded in develop- 
ing exactly the same argument in a very simple algebraic form. 
For the satisfaction of those who have read Thurstone’s presen- 
tation, we shall show the parallelism of the two derivations in 
footnotes, so far as our argument proceeds. We do not go into 
all the ramifications reported by Thurstone, by any means; but 
for every one of them that is couched in geometrical form there is a 
parallel and equally cogent analytical form. It is only the deriva- 
tion of the formulas in our presentation that differs from Thurs- 
tone’s; the arithmetic is exactly the same; and, of course, the 
outcomes are the same. We shall take this occasion to say, how- 
ever, that, if the reader wishes to be at home in the literature 
of mathematical statistics, he must learn the geometry of hyper- 
space because so many of the fundamental developments in 
statistics are couched in terms of it. 

What is multiple-factor analysis? When we measure such a 
trait as “general intelligence,” we may not be measuring a uni- 
tary attribute. It is conceivable that we are catching in the 
measured trait a component of reading ability, another component
of ability to visualize geometric forms, of ability to see
relations, etc. It is likely that different tests of what is supposed
to be this same function will measure each of these constituent 
factors with different degrees of effectiveness (validity). Mul- 
tiple-factor analysis attempts to determine how many such 
independent factors are needed to account for our scores as 
revealed by their behaviors in a set of intercorrelations from a 
number of tests which are alleged to be tests of the same func- 
tion and to determine how heavily each of the tests is weighted 
with each of these factors. A basic consideration to keep in 
mind is that these factors are to be uncorrelated with one another 
(because otherwise they would not be independent) and that 
we wish to account for the test scores with the smallest possible 
number of factors and hence wish to take out on each successive 
trial the maximum possible load. 

¹ Thurstone, L. L., The Vectors of Mind, University of Chicago Press,
1935.

THE BASIC EQUATIONS


Let 1 and 2 stand for two different abilities possessed by an
individual, and let $z_1$ and $z_2$ be standard scores indicating
the true amount of these abilities possessed by him. Let $a_1$ and
$a_2$ be the corresponding weightings (percentages of perfect
validity) with which test a measures these abilities. Then, if
$Z_a$ is the individual’s standard score on the test as a whole,

$$Z_a = a_1z_1 + a_2z_2 \tag{A}$$

And for a second test, b,

$$Z_b = b_1z_1 + b_2z_2 \tag{B}$$

Multiplying (A) by (B),

$$Z_aZ_b = a_1b_1z_1^2 + a_2b_2z_2^2 + a_1b_2z_1z_2 + a_2b_1z_1z_2$$

Summing for the whole population and dividing by N,

$$\frac{\Sigma Z_aZ_b}{N} = a_1b_1\frac{\Sigma z_1^2}{N} + a_2b_2\frac{\Sigma z_2^2}{N} + a_1b_2\frac{\Sigma z_1z_2}{N} + a_2b_1\frac{\Sigma z_1z_2}{N}$$

The value on the left of the equation is $r_{ab}$, because the mean
of the paired products of z scores is the coefficient of correlation.
$\Sigma z^2/N$ is the variance ($\sigma^2$) of a set of z scores,
hence equal to 1. Furthermore, the two abilities are uncorrelated,
by hypothesis, so that $\Sigma z_1z_2/N$, which is the coefficient
of correlation between them, equals zero. Therefore

$$r_{ab} = a_1b_1 + a_2b_2 + 0 + 0$$

If there were more tests,

$$r_{ac} = a_1c_1 + a_2c_2; \qquad r_{bd} = b_1d_1 + b_2d_2; \qquad \text{etc.}$$


Square Eq. (A), sum, and divide by N, and observe that the
variances are 1 and that the abilities are uncorrelated.

$$Z_a^2 = a_1^2z_1^2 + a_2^2z_2^2 + 2a_1a_2z_1z_2$$

$$\frac{\Sigma Z_a^2}{N} = a_1^2\frac{\Sigma z_1^2}{N} + a_2^2\frac{\Sigma z_2^2}{N} + 2a_1a_2\frac{\Sigma z_1z_2}{N}$$

$$1 = a_1^2 + a_2^2$$

If there is an error factor and a specific factor, uncorrelated
with the others, denoted by e and s, respectively,

$$a_1^2 + a_2^2 + a_e^2 + a_s^2 = 1$$

$$a_1^2 + a_2^2 = 1 - (a_e^2 + a_s^2) = h_a^2,$$

the communality of the test. The reliability coefficient of the
test would be exactly the same as the communality except for
specific factors. By reason of these the reliability is likely to be
a little higher than the communality, but at any rate the
communality has the reliability coefficient as its upper limit.
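
The basic equations can be verified on a toy population. The following is a minimal sketch, not data from any study: the four "individuals," their standard scores, and the weightings are hypothetical, chosen so that the two abilities have unit variance and are exactly uncorrelated, and so that each test is fully accounted for by the two abilities.

```python
def mean_product(x, y):
    """Mean of the paired products, i.e. the sum of x*y divided by N."""
    return sum(p * q for p, q in zip(x, y)) / len(x)


# Hypothetical standard scores on two abilities for N = 4 individuals.
# Each has mean 0 and variance 1, and the two are uncorrelated.
z1 = [1.0, 1.0, -1.0, -1.0]
z2 = [1.0, -1.0, 1.0, -1.0]

# Hypothetical weightings of tests a and b (no error or specific factor).
a1, a2 = 0.6, 0.8
b1, b2 = 0.8, 0.6

Z_a = [a1 * p + a2 * q for p, q in zip(z1, z2)]  # Eq. (A)
Z_b = [b1 * p + b2 * q for p, q in zip(z1, z2)]  # Eq. (B)

r_ab = mean_product(Z_a, Z_b)    # should equal a1*b1 + a2*b2
var_a = mean_product(Z_a, Z_a)   # should equal a1**2 + a2**2 = 1

print(r_ab, a1 * b1 + a2 * b2)
print(var_a, a1 ** 2 + a2 ** 2)
```

With an error or specific factor present, the sum of the squared common-factor weightings would fall short of 1 by the amount of the unique variance.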


FINDING THE FIRST FACTOR LOADINGS


We wish now to find the values for the factor loadings on the
several tests. That is, we wish to find values for $a_1$, $a_2$,
$b_1$, and $b_2$. Moreover, we want to account for the test scores
with the smallest number of factors that is possible. Hence we wish
to take out each time the maximum loading for each factor
isolated. In order to extend our problem a little further than
above and yet keep within convenient limits, let us assume
three abilities and up to k tests. These may be laid out and
summed as below.

k will stand for any test. $k_1$ will be any test loading in
factor 1, $k_2$ any loading in factor 2, etc. $\Sigma k_1$ will be
the sum of the factor loadings in all the tests combined in respect
to factor 1, etc.


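$$r_{aa} = a_1a_1 + a_2a_2 + a_3a_3$$

$$r_{ab} = a_1b_1 + a_2b_2 + a_3b_3$$

$$r_{ak} = a_1k_1 + a_2k_2 + a_3k_3$$

$$\Sigma r_{ak} = a_1\Sigma k_1 + a_2\Sigma k_2 + a_3\Sigma k_3$$

Doing a similar thing for the correlations of each other test with
all the other tests, then summing for these partial sums, and
letting $\Sigma r$ be $\Sigma\Sigma r_{kk}$, and therefore the sum
of all the intercorrelations,

$$\Sigma r_{ak} = a_1\Sigma k_1 + a_2\Sigma k_2 + a_3\Sigma k_3 \tag{C}$$

$$\Sigma r_{bk} = b_1\Sigma k_1 + b_2\Sigma k_2 + b_3\Sigma k_3$$

$$\Sigma r_{kk} = k_1\Sigma k_1 + k_2\Sigma k_2 + k_3\Sigma k_3$$

$$\Sigma\Sigma r_{kk} = \Sigma k_1\Sigma k_1 + \Sigma k_2\Sigma k_2 + \Sigma k_3\Sigma k_3$$

$$\Sigma r = (\Sigma k_1)^2 + (\Sigma k_2)^2 + (\Sigma k_3)^2$$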


Now comes the crux of our development. We want $\Sigma k_1$ to be
a maximum. Therefore $\Sigma k_2$ and $\Sigma k_3$ must equal zero,
because if either differed from zero by any amount, whether
positive or negative, the value when squared would be positive and
hence would reduce the value of $\Sigma k_1$ and violate our
assumption that $\Sigma k_1$ is to be a maximum.¹ Therefore

$$(\Sigma k_1)^2 = \Sigma r, \qquad \text{and} \qquad \Sigma k_1 = \sqrt{\Sigma r} \tag{D}$$

From (C) it follows that $a_1\Sigma k_1 = \Sigma r_{ak}$, because
the two other terms in equation (C) must be zero for the reason
given above. Substituting from (D),

$$a_1\sqrt{\Sigma r} = \Sigma r_{ak}; \qquad a_1 = \frac{\Sigma r_{ak}}{\sqrt{\Sigma r}}$$

Similarly,

$$b_1 = \frac{\Sigma r_{bk}}{\sqrt{\Sigma r}} \tag{154}$$
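
Formula (154) can be tried on an artificial case in which the answer is known in advance. In this sketch the loadings are hypothetical, the correlations are built from a single factor with the true communalities in the diagonal, and `first_centroid_loadings` is simply our name for the column-sum rule; under those assumptions the rule returns the loadings we started from.

```python
import math


def first_centroid_loadings(R):
    """Column sum divided by the square root of the sum of all entries (154)."""
    n = len(R)
    col_sums = [sum(R[i][j] for i in range(n)) for j in range(n)]
    root = math.sqrt(sum(col_sums))
    return [s / root for s in col_sums]


# Hypothetical loadings of four tests on a single common factor.
k = [0.9, 0.8, 0.7, 0.6]

# With one factor only, r_jk = k_j * k_k, and the diagonal holds k_j squared.
R = [[kj * kk for kk in k] for kj in k]

recovered = first_centroid_loadings(R)
print(recovered)
```

When more than one factor is present, or when the diagonal entries are only estimates, the recovered values are no longer exact, which is why residuals are examined afterward.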


We have now reached the climax of our development and are
ready to apply our technique to a concrete problem and to find
numerical values for the first factor loading. In Table XIV are
displayed intercorrelations among ten tests designated by the
first ten letters of the alphabet. In the conventional literature
such a table is called the correlational matrix; but it is nothing
but a systematic arrangement of the correlations of every test
in the series with every other test. Thus $r_{ab}$ is .544,
$r_{df}$ is .648, etc.

TABLE XIV. INTERCORRELATIONS AMONG THE TEN TESTS
(Sometimes Called the Correlational Matrix)

      |   a    |   b    |   c    |   d    |   e    |   f    |   g    |   h    |   i    |   j
a     | (.834) |  .544  |  .500  |  .488  |  .545  |  .642  |  .834  |  .715  |  .453  |  .366
b     |  .544  | (.544) |  .282  |  .293  |  .320  |  .352  |  .473  |  .450  |  .272  |  .101
c     |  .500  |  .282  | (.529) |  .483  |  .381  |  .438  |  .498  |  .529  |  .359  |  .301
d     |  .488  |  .293  |  .483  | (.656) |  .571  |  .648  |  .563  |  .656  |  .368  |  .332
e     |  .545  |  .320  |  .381  |  .571  | (.622) |  .622  |  .568  |  .595  |  .365  |  .344
f     |  .642  |  .352  |  .438  |  .648  |  .622  | (.729) |  .703  |  .729  |  .419  |  .269
g     |  .834  |  .473  |  .498  |  .563  |  .568  |  .703  | (.834) |  .723  |  .507  |  .364
h     |  .715  |  .450  |  .529  |  .656  |  .595  |  .729  |  .723  | (.729) |  .621  |  .457
i     |  .453  |  .272  |  .359  |  .368  |  .365  |  .419  |  .507  |  .621  | (.621) |  .393
j     |  .366  |  .101  |  .301  |  .332  |  .344  |  .269  |  .364  |  .457  |  .393  | (.457)
Σr_k  | 5.921  | 3.631  | 4.300  | 5.058  | 4.933  | 5.551  | 6.067  | 6.204  | 4.378  | 3.384
k_1   |  .842  |  .517  |  .612  |  .719  |  .702  |  .790  |  .863  |  .882  |  .623  |  .481

Σr = 49.427; √Σr = 7.030433; 1/√Σr = 0.1422387

¹ This is the first departure of our derivation from Thurstone’s.
Making $\Sigma k_1$ a maximum is identical in principle with the
fact that passing the axis through the centroid necessarily makes
the projections on axis I a maximum and the projections on the
other axes a minimum.

The formula we just derived, (154), states that we must add
all the r’s in the first column for $\Sigma r_{ak}$, all in the
second column for $\Sigma r_{bk}$, etc. Those sums are entered in
the row labeled Σr_k in the table. Formula (154) also says that we
must get the first factor loading in test a, ($a_1$), by dividing
$\Sigma r_{ak}$ by the square root of the sum of all the
intercorrelations in the table. This sum is 49.427 and its square
root is 7.030433. But it will be easier to multiply by the
reciprocal of 7.030433 than to divide by the number itself. The
reciprocal is 0.1422387. Multiplying $\Sigma r_{ak}$ (which is
5.921) by this we get .842. That is the weighting of factor 1 in
test a. Similarly, multiplying in turn the summation at the foot of
each column by 0.1422387, we get the weightings of factor 1 in each
of the nine other tests, as shown in the last row of the table.
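
The arithmetic of Table XIV can be reproduced mechanically. The sketch below assumes the table as we read it, with the estimated communalities in the diagonal; the variable names are ours.

```python
import math

TESTS = "abcdefghij"

# Table XIV with the estimated communalities written in the diagonal.
R = [
    [.834, .544, .500, .488, .545, .642, .834, .715, .453, .366],
    [.544, .544, .282, .293, .320, .352, .473, .450, .272, .101],
    [.500, .282, .529, .483, .381, .438, .498, .529, .359, .301],
    [.488, .293, .483, .656, .571, .648, .563, .656, .368, .332],
    [.545, .320, .381, .571, .622, .622, .568, .595, .365, .344],
    [.642, .352, .438, .648, .622, .729, .703, .729, .419, .269],
    [.834, .473, .498, .563, .568, .703, .834, .723, .507, .364],
    [.715, .450, .529, .656, .595, .729, .723, .729, .621, .457],
    [.453, .272, .359, .368, .365, .419, .507, .621, .621, .393],
    [.366, .101, .301, .332, .344, .269, .364, .457, .393, .457],
]

# Column sums, grand sum, and the reciprocal of its square root.
col_sums = [sum(row[j] for row in R) for j in range(len(R))]
total = sum(col_sums)
reciprocal = 1.0 / math.sqrt(total)

# Formula (154): each first-factor loading is a column sum times the reciprocal.
k1 = [s * reciprocal for s in col_sums]

for name, s, k in zip(TESTS, col_sums, k1):
    print(f"{name}: sum = {s:.3f}   k1 = {k:.3f}")
print(f"total = {total:.3f}   reciprocal = {reciprocal:.7f}")
```

Carried to full precision, the loading for test b comes out very slightly below the tabled .517; the other nine agree with the table to three decimals.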

Examination of Table XIV will show certain correlations
enclosed in parentheses, the ones constituting the diagonal of the
matrix. They are the estimated communalities, referred to earlier
in this chapter. In a full set of intercorrelations there would
appear such values as $r_{aa}$, $r_{bb}$, etc. These would be the
self-correlations of the tests in respect to the factors common to
the tests which still remain. But they are seldom known. Even if we
had reports on the reliabilities of the tests, these reliability
coefficients would not be exactly the same as the communalities
wanted, because the reliability coefficients would be raised above
the communalities by reason of the presence of a factor specific
to the several pairs of tests in addition to those factors common
to all the tests. Thus the communality is a little lower than the
reliability coefficient. It is also true that any inter-function
correlation is somewhat lower than the reliabilities of the
correlated tests, except for chance fluctuation. So the highest
inter-function correlation is not a bad guess at the communality.
Hence, following Thurstone, we hunt in column a of Table XIV the
highest r in the column and write it in the diagonal as the
communality. For $r_{aa}$ that is .834. We do likewise in each of
the other columns in the table.
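
The rule for estimating communalities is easily mechanized. The sketch below uses only the first four tests of Table XIV, so the estimates differ from those in the full table; the function name is ours, and the use of absolute values is our assumption, made so that the same routine would serve for residual tables containing negative entries.

```python
def estimate_communalities(R):
    """Write the largest off-diagonal entry (in absolute value) of each
    column into that column's diagonal cell, leaving the input unchanged."""
    n = len(R)
    out = [row[:] for row in R]
    for j in range(n):
        out[j][j] = max(abs(R[i][j]) for i in range(n) if i != j)
    return out


# Intercorrelations of tests a, b, c, d only, diagonal not yet filled in.
R = [
    [0.0, .544, .500, .488],
    [.544, 0.0, .282, .293],
    [.500, .282, 0.0, .483],
    [.488, .293, .483, 0.0],
]

filled = estimate_communalities(R)
diagonal = [filled[j][j] for j in range(len(filled))]
print(diagonal)
```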


FINDING A SECOND FACTOR 


In the development with which we opened this section, we
had equations of the type

$$r_{ab} = a_1b_1 + a_2b_2 + a_3b_3$$

We know the values of $r_{ab}$, of $a_1$, and of $b_1$, because we
were given the former and have just found the latter two.
Transposing so as to get these known values on one side of the
equation, we have

$$a_2b_2 + a_3b_3 = r_{ab} - a_1b_1 \tag{155}$$


That is, if there are in the measures any other common factors 
than 1, there will be certain residuals in the correlations after 
factor 1 has been removed. These residuals are found in the 
manner indicated by Eq. (155). In this particular case the 
residual remaining as $a_2b_2 + a_3b_3$ would be found by substituting
the value of $r_{ab}$ and the obtained values of $a_1$ and $b_1$.

$$r_{ab.1} = .544 - (.842)(.517) = +.109$$


The residual we write in the proper space to represent $r_{ab}$ in
the table of first residuals, Table XV. But instead of writing
its sign directly by the r value, we shall enter it at the top of
the cell space directly at the left of it. That is done because we
shall later wish to make some changes in these temporary signs.
Similarly

$$r_{ac.1} = .500 - (.842)(.612) = -.015$$

$$r_{cd.1} = .483 - (.612)(.719) = +.043$$


and so on with all the others. We enter all of these in the table 
headed First Residuals in the cells immediately to the left of 
the respective residuals (Table XV). This process includes the 
residual communalities, entered in the diagonal in parentheses. 
Now we wish to isolate from this table of residuals a second
common factor. The argument is precisely the same as it was
in the case of factor 1, so that we should be able to employ a
second time the same procedure. We proceed, therefore, to add
our successive columns of r’s as we added the columns of Table
XIV. But a strange thing happens; all sums come out zero, or
practically so. Some further algebraic manipulation would show
us that all these sums must be zero if our arithmetic has been
correct. Of course, not only would the sums of the separate
columns be zero but the sum of all the intercorrelations would
be zero, since this latter sum is obtained by summing the column
totals. Thus, when we attempt to get our second factor loadings,
each of them will be 0/0. This is an indeterminate expression
and will get us nowhere. We must try some scheme of
avoiding this pitfall.
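
That the residual columns must sum to zero can be seen on a small case. This sketch is an illustration only: it takes the first four tests of Table XIV with communalities estimated from those four alone (an assumption of ours), extracts the first factor, and sums each column of residuals with the diagonal included. It also repeats the one residual worked in the text.

```python
import math


def first_loadings(R):
    """First-factor loadings: column sum over the root of the grand sum."""
    col_sums = [sum(row[j] for row in R) for j in range(len(R))]
    root = math.sqrt(sum(col_sums))
    return [s / root for s in col_sums]


def residuals(R, k):
    """Remove one factor: each r_jk is reduced by the product of loadings."""
    n = len(R)
    return [[R[i][j] - k[i] * k[j] for j in range(n)] for i in range(n)]


# Tests a, b, c, d of Table XIV; diagonal = highest r among these four.
R = [
    [.544, .544, .500, .488],
    [.544, .544, .282, .293],
    [.500, .282, .500, .483],
    [.488, .293, .483, .488],
]

k1 = first_loadings(R)
first_residuals = residuals(R, k1)
residual_col_sums = [sum(row[j] for row in first_residuals) for j in range(len(R))]

# The single residual worked in the text, from the full ten-test loadings.
r_ab_1 = 0.544 - 0.842 * 0.517

print(residual_col_sums)
print(round(r_ab_1, 3))
```

The column sums vanish because each column sum of the original table equals its loading times the sum of all the loadings, which is exactly what the subtraction removes.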

In this dilemma Thurstone has proposed that we change the 
signs of some of the tests. Any test score may be either positive 
or negative according to the way in which it is oriented. If, 
for example, a positive score means “tactful,” the same score 
with the negative sign would mean “tactless.” It is, therefore,
entirely legitimate to imagine all scores on the test reversed in
sign, so far as the remaining factors are concerned.¹ The fact
that we could not in practice reverse the part of the score that 
remains after taking out of it factor 1 need not bother us, because 
we are making the change merely conceptually and shall return 
to the original sign when our purpose has been met. To change 
the sign of all scores in one of the arrays will change the sign of 
its correlation with every other array. We shall change the signs 
of such tests as will let us take out of our correlations at the 
next attempt the largest possible loading. There are several 
methods of doing this, but we shall choose that one of Thurstone’s 
proposals which gives unique results; i.e., one that could be 
followed in precisely the same manner by persons working 
entirely independently. 

We shall sum the residuals (Table XV) algebraically by
columns, not including the communalities, entering the sums in
row $\Sigma_0$. After all the columns have been thus summed, we
shall find the one having the highest negative sum.² In this
trial that is test b. We mark $x_1$ above that column to help
us remember that we chose it first for “reflection.” We now
“reflect” (change all signs in) test b, entering the new signs
just below the original ones in the narrow column at the left.
But, having changed the signs of the r’s involving b in the column,
we must, of course, change them also in row b, indicating that
fact by an $x_1$ after the b. Now we again sum our columns,
disregarding communalities, and get a new row of sums, $\Sigma_1$.
In that row the greatest negative value is in the column for test a.
So we change the signs of all correlations in column a and in row
a, indicating by an $x_2$ that we have done so. After summing
again, we reflect test g. Upon summing after that change, we
find that all the signs are positive. We are now through with
the process of reflection as far as this table is concerned.

¹ Our changing some of the signs in the first and second residuals
is identical with Thurstone’s procedure. But, whereas in a
geometrical system one must think of reflecting the test vector
from one hemisphere to the opposite one, we merely think what would
happen algebraically if we changed the signs of all the items of
one of the variables when computing an r between it and another.

² In the geometric system one passes the axis through the most dense
cluster of points. “If an observer were stationed at the origin and
he could see in space of (r − 1) dimensions, he would discover
clustering of the points if a second factor is conspicuously
present . . . We want the second axis to go in that direction.”
(Thurstone, A Simplified Method, pp. 4-5). The precise equivalent
in our system is the choice of the column that aggregates
arithmetically large r’s. For arithmetically high correlations make
a clustering of points, since the correlations are expressed by the
cosines of the angles between the direction vectors, and large
cosines mean small angles.
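
The reflection routine just described lends itself to a short program. The following is a sketch of that one rule only (reflect the column with the most negative sum, communalities disregarded, and repeat); the residual values are hypothetical and the function names are ours.

```python
def column_sums(m):
    """Algebraic column sums, leaving out the diagonal (communalities)."""
    n = len(m)
    return [sum(m[i][j] for i in range(n) if i != j) for j in range(n)]


def reflect_until_positive(residual, max_steps=100):
    """Repeatedly reflect the test whose column sum is most negative.

    Returns the reflected table, a list of +1/-1 marks showing which tests
    stand reflected, and the order in which tests were chosen.
    """
    n = len(residual)
    m = [row[:] for row in residual]
    signs = [1] * n
    order = []
    for _ in range(max_steps):
        sums = column_sums(m)
        worst = min(range(n), key=lambda j: sums[j])
        if sums[worst] >= 0:
            break
        for i in range(n):
            if i != worst:
                m[i][worst] = -m[i][worst]  # change signs in the column
                m[worst][i] = -m[worst][i]  # and in the matching row
        signs[worst] = -signs[worst]
        order.append(worst)
    return m, signs, order


# Hypothetical first residuals for four tests (diagonal left at zero).
residual = [
    [0.0, 0.2, -0.3, -0.1],
    [0.2, 0.0, -0.2, -0.3],
    [-0.3, -0.2, 0.0, 0.4],
    [-0.1, -0.3, 0.4, 0.0],
]

reflected, signs, order = reflect_until_positive(residual)
print("order of reflection:", order)
print("signs:", signs)
print("column sums after reflection:", column_sums(reflected))
```

Each reflection of a column with a negative sum raises the total of the off-diagonal entries, so the routine cannot cycle; the `max_steps` guard is there only as a precaution.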

But before we proceed further, we shall look to our communali- 
ties. In fact it would have been better to do this before reflecting 
any signs, just as soon as we had tested the correctness of our 
arithmetic by finding that the columns sum to zero. We enclose 
in parentheses the communalities we brought over from the 
previous table to show that we shall have no further use for 
them. The argument about what our communalities shall be 
in this table is exactly the same as that advanced in connection 
with Table XIV. So here again we take as our communality 
in a column the highest inter-function correlation in that column, 
entering it in the diagonal with positive sign. All communalities 
must be positive in sign because they are self-correlations, no 
matter whether the inter-function correlations from which they 
were inferred were positive or negative. It would not really 
be wrong to use the residual communalities standing in paren- 
theses. If our guess about the communality in Table XIV had 
been correct, the residual would be correct. But, since our first 
estimate was merely a guess, we choose not to trust it very far 
but make a new estimate by taking for the communality the 
highest inter-function residual in the column. 

With these changes in signs and in communalities completed, 
we place our final signs in front of the residual correlations 
and proceed to find a second set of weightings by exactly the
same technique as we employed before in getting factor 1 loadings.
These stand in the row labeled $k_2$. But they are the
values with some reversed signs. In order to get back to the
values actually used in the tests, we must reverse the signs for
all those tests for which we changed signs to get them. That is,
we must reverse the signs of the obtained factor loadings for tests
a, b, and g. The corrected loadings are given in the last row
of the table.
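
Getting second-factor loadings from a reflected table, and then restoring the signs, may be sketched as follows. The reflected residuals and the record of reflected tests are hypothetical, and the function name is ours; the communality in each column is taken as the largest off-diagonal entry, entered with positive sign.

```python
import math


def loadings_after_reflection(reflected, signs):
    """Loadings from a reflected residual table, then signs restored.

    `signs` holds -1 for each test that was reflected and +1 otherwise.
    """
    n = len(reflected)
    m = [row[:] for row in reflected]
    for j in range(n):
        # New communality: largest off-diagonal residual, taken as positive.
        m[j][j] = max(abs(m[i][j]) for i in range(n) if i != j)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    root = math.sqrt(sum(col_sums))
    k_reflected = [s / root for s in col_sums]
    k_restored = [s * k for s, k in zip(signs, k_reflected)]
    return k_reflected, k_restored


# Hypothetical residuals after tests 0 and 1 have been reflected.
reflected = [
    [0.0, 0.2, 0.3, 0.1],
    [0.2, 0.0, 0.2, 0.3],
    [0.3, 0.2, 0.0, 0.4],
    [0.1, 0.3, 0.4, 0.0],
]
signs = [-1, -1, 1, 1]

k2_reflected, k2 = loadings_after_reflection(reflected, signs)
print([round(x, 3) for x in k2_reflected])
print([round(x, 3) for x in k2])
```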


FINDING A THIRD FACTOR


Our argument continues to recur in the same form. We left
off above with

$$r_{ab.1} = a_2b_2 + a_3b_3$$

Transposing,

$$a_3b_3 = r_{ab.1} - a_2b_2 \tag{156}$$

That is, there may still remain in our correlations certain
residuals, owing to a remaining factor or factors. We know
$r_{ab.1}$, $a_2$, and $b_2$ from Table XV, so that we can find the
residual by exactly the same technique as we used in getting the
first residuals.

$$r_{ab.12} = r_{ab.1} - a_2b_2 = .109 - (+.359)(+.373) = -.025$$


We write this in the proper place in the table of second residuals 
(Table XVI). Notice that we use the final signs in the r’s, even 
if some are reversed as compared with their original values, and 
the $k_2$ values before the last correction in the factor loadings.
But we must keep track of the number of reversals of each test 
so that we may ultimately restore the original signs. Continuing 
the process started for the one cell just above, we get the remain- 
der of the residuals for our Table XVI entirely analogously to 
the manner in which we got the first residuals for Table XV. 
Then we test our arithmetic by summing the columns, algebraic 
signs considered and communalities included. If the arithmetic 
is correct, all columns will sum to zero, or practically so. If not, 
we must discover the error before proceeding further. We 
then proceed just as in Table XV with the reflection of tests, 
achieving all positive signs after five reflections. Of course, if it 
appeared clear that we could not reach all positive signs, we 
would stop reflecting when we had attained that goal as nearly 
as feasible. Then, if not before, we put in new communalities
and get the $k_3$ row and the corrected $k_3$ row of loadings in
the same manner as in Table XV.

We can continue to do this through as many factors as we
wish, until our residuals are so small that they are obviously 
due to chance. 


TRANSFORMING THE VALUES 
(Equivalent to Rotating the Axes) 


Now we collect all our weightings so far determined into a 
summary table—Table XVII. (Pay no attention now to the 
numbers in parentheses; we shall refer to these later.) The 
columns appear to belong to two different systems; all weightings 
in factor 1 have positive signs, but about half of the entries in 
factors 2 and 3 have negative signs. It may be possible to make 
the three factors more comparable by transforming all of them 
to the same sort of system, having the same zero point. As a 
matter of fact there is no unique solution to a multiple-factor 
problem; an infinite number of different values would satisfy 
the fundamental equations if only these values maintain the 
right relations inter se. So, in order to have a unique solution, 
let us impose the condition that all the weightings shall be 
positive (so far as possible) and that the number of zero loadings 
shall be maximized. We can control this last condition at least 
to the extent that each column except one shall have at least one 
zero and (if a positive manifold is possible) that its loadings shall 
run from zero up. But we must, of course, keep our two basic 
types of equations intact as to total value. If the primed 
symbols represent new values, it must remain true that 


$$a_1'b_1' + a_2'b_2' + a_3'b_3' = a_1b_1 + a_2b_2 + a_3b_3$$

because each of these must equal $r_{ab}$, which has a fixed value.
Also

$${a_1'}^2 + {a_2'}^2 + {a_3'}^2 = a_1^2 + a_2^2 + a_3^2 = h_a^2$$
because these are the communalities, and they have fixed values 
which may not be changed capriciously. Of course, the cor- 
responding equations must hold for the other tests. But we may 
shift values within the equations, provided the values of the 
equations as wholes are kept intact. (This is Thurstone’s prin- 
ciple of rotation of axes.) But to attack the problem of trans- 
forming all factor loadings at once gets us into complicated 
algebra. So we shall hold one factor constant while we manipulate
the other two, then hold one of the first pair constant while
we manipulate the left over one paired with one of the prior
factors. (This is Thurstone’s principle of rotating about one
axis at a time.)¹ Taking the first two factors while holding
factor 3 constant,

$${a_1'}^2 + {a_2'}^2 = a_1^2 + a_2^2 = h_{a12}^2$$

Let $a_2'$ equal 0. Then

$${a_1'}^2 + 0 = a_1^2 + a_2^2 = h_{a12}^2$$

Hence $a_1' = h_{a12}$. Again

$$a_1'b_1' + a_2'b_2' = a_1b_1 + a_2b_2$$

Since $a_2' = 0$,

$$a_1'b_1' = a_1b_1 + a_2b_2; \qquad \text{and hence} \qquad b_1' = \frac{a_1b_1 + a_2b_2}{a_1'}$$

We could now get new values for $a_1'$, $a_2'$, and $b_1'$, since
all the required values in these equations are known. For any other
required new loadings we could get new values by constructing
similar equations involving them.

Although we could use the above types of formula for getting
a set of transformed values, we can simplify them further for
computational purposes by a little algebraic manipulation.
Reproducing them in generalized form, where k may stand for any
test and m may stand for the test with the lowest factor loading
in the independent function when corrected for uniqueness (i.e.,
when the loading has been divided by the square root of the
communality, i.e., when divided by h), we have

$$k_1' = \frac{m_1k_1 + m_2k_2}{m_1'} \tag{E}$$

The $m_1$, $m_2$, and $m_1'$ will recur in the computation for each
row, so we may as well make the required divisions once for the
whole set of tests. Letting $m_1/m_1'$ be represented by $m_{10}$
and $m_2/m_1'$ by $m_{20}$, (E) becomes

$$k_1' = m_{10}k_1 + m_{20}k_2 \tag{157}$$


¹ For a new scheme by which several rotations can be made
simultaneously, see L. L. Thurstone, “A New Rotational Method in
Factor Analysis,” Psychometrika, Vol. 3, pp. 199-218 (1938).

For $k_2'$ we draw in addition upon the following two propositions:

$$m_1^2 + m_2^2 = h_{m12}^2 \tag{F}$$

Since $m_2' = 0$, ${m_1'}^2 = h_{m12}^2$. Hence, dividing formula (F)
by ${m_1'}^2$,

$$m_{10}^2 + m_{20}^2 = 1 \tag{G}$$

$$k_1^2 + k_2^2 = h_{k12}^2 \tag{H}$$

Multiply (G) by (H), then subtract formula (157) squared,

$$m_{10}^2k_1^2 + m_{10}^2k_2^2 + m_{20}^2k_1^2 + m_{20}^2k_2^2 = h_{k12}^2$$

$$m_{10}^2k_1^2 + 2m_{10}m_{20}k_1k_2 + m_{20}^2k_2^2 = {k_1'}^2$$

$$m_{10}^2k_2^2 - 2m_{10}m_{20}k_1k_2 + m_{20}^2k_1^2 = (h_{k12}^2 - {k_1'}^2) = {k_2'}^2$$

Taking square root,

$$k_2' = m_{10}k_2 - m_{20}k_1 \tag{158}$$


Formulas (157) and (158) are the ones we use for getting the
new weighting in the rotated system.¹
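
Formulas (157) and (158) can be put to a quick test. In this sketch the loadings are hypothetical and `rotate_pair` is our own name; the check is that the transformed values leave both the two-factor communalities and the sums of cross products unchanged, while the reference test m receives a zero loading in the second factor.

```python
import math


def rotate_pair(f1, f2, m):
    """Transform two columns of loadings by (157) and (158).

    `m` is the index of the reference test whose new factor 2 loading
    is to become zero.
    """
    h = math.hypot(f1[m], f2[m])      # m1' equals h for the reference test
    m10, m20 = f1[m] / h, f2[m] / h
    new1 = [m10 * k1 + m20 * k2 for k1, k2 in zip(f1, f2)]  # (157)
    new2 = [m10 * k2 - m20 * k1 for k1, k2 in zip(f1, f2)]  # (158)
    return new1, new2


# Hypothetical loadings of three tests on factors 1 and 2.
f1 = [0.8, 0.6, 0.7]
f2 = [0.3, -0.4, 0.1]
m = 1  # reference test, chosen here by inspection

new1, new2 = rotate_pair(f1, f2, m)

# Quantities that the transformation must leave unchanged.
communality_before = [p * p + q * q for p, q in zip(f1, f2)]
communality_after = [p * p + q * q for p, q in zip(new1, new2)]
cross_before = f1[0] * f1[2] + f2[0] * f2[2]
cross_after = new1[0] * new1[2] + new2[0] * new2[2]

print(new1, new2)
print(communality_before, communality_after)
print(cross_before, cross_after)
```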

We shall make a numerical application of these formulas to 
the transformation of values in factors 1 and 2, columns headed 
1 and 2 in Table XVII, substituting for k the several tests in 
succession and letting factor 1 be the dependent one and factor 2 
the independent one. We want the values in factor 2 to be so 
transformed that they will extend from zero up in the plus 
direction. 

Test b has the lowest negative loading in factor 2 when corrected
for uniqueness, i.e., when divided by $h_{12}$. Consequently,
let it be m. $b_2' = 0$, by hypothesis. Hence

$$b_1' = \sqrt{b_1^2 + b_2^2} = h_{b12} = +.638$$

$$m_{10} = \frac{b_1}{b_1'} = \frac{.517}{.638} = .811; \qquad \text{and} \qquad m_{20} = \frac{b_2}{b_1'} = \frac{-.373}{.638} = -.585$$

¹ Our algebraic system of transforming loadings is identical in
outcome with Thurstone’s rotation in hyperspace. Ordinarily the
rotating must be done about one axis at a time. Parallel to this,
we transform two columns at a time. Guilford (p. 489) quotes from
Thurstone the following formulas for computing new loadings, and
the bases for these formulas are laid in Thurstone’s Vectors,
pp. 203-205:

$$k_1' = k_1\cos\phi + k_2\sin\phi$$

$$k_2' = k_2\cos\phi - k_1\sin\phi$$

where $\phi$ is the angle of rotation. If the reader will visualize,
or actually construct, a plotting of the tests on a plane for two
reference vectors, and will rotate the axes so as to make axis I
pass through the test with the lowest negative loading, m, he will
find that $m_{10}$ is $\cos\phi$ and that $m_{20}$ is $\sin\phi$.
With these substitutions our formulas become identical with those
of Thurstone’s rotational system.


TABLE XVII. FACTOR LOADINGS BEFORE ROTATION
(Starting Factor Loadings in Parentheses)

Test        1             2             3             h²
a       +.842 (.8)    -.359 (.3)    +.079 (.1)    .844 (.74)
b       +.517 (.7)    -.373 (.0)    -.124 (.1)    .422 (.50)
c       +.612 (.3)    +.027 (.3)    +.043 (.3)    .377 (.27)
d       +.719 (.3)    +.261 (.4)    -.237 (.6)    .641 (.61)
e       +.702 (.5)    +.103 (.4)    -.233 (.3)    .558 (.50)
f       +.790 (.6)    +.063 (.3)    -.244 (.5)    .688 (.70)
g       +.863 (.7)    -.248 (.4)    +.050 (.3)    .809 (.74)
h       +.882 (.7)    +.142 (.5)    +.067 (.3)    .803 (.83)
i       +.623 (.4)    +.086 (.5)    +.267 (.0)    .467 (.41)
j       +.481 (.2)    +.213 (.6)    +.314 (.0)    .375 (.40)

For each new factor 1 loading we shall need to multiply the
original factor loading by $m_{10}$, which is +.811, and add to that
the product of the corresponding factor 2 loading by $m_{20}$, which
is -.585; add these products algebraically. For this purpose
it is convenient to write +.811 beneath the factor 1 column
and -.585 beneath the factor 2 column, where reference can
be easily made to them. For the new factor 2 loadings each
original loading in factor 2 must be multiplied by $m_{10}$ and from
that must be subtracted the product of $m_{20}$ and the corresponding
original factor 1 loading. For this purpose it is most convenient
to write, under column 1, $m_{20}$ with reversed sign (+.585);
and, under column 2, $m_{10}$ (+.811); then add the products
algebraically as before. We shall illustrate a few of these sums of
products.


$a_1' = m_{10}a_1 + m_{20}a_2 = (.811)(.842) + (-.585)(-.359) = .893$
$c_1' = m_{10}c_1 + m_{20}c_2 = (.811)(.612) + (-.585)(+.027) = .481$
$a_2' = m_{10}a_2 - m_{20}a_1 = (.811)(-.359) - (-.585)(.842) = .201$
$c_2' = m_{10}c_2 - m_{20}c_1 = (.811)(+.027) - (-.585)(.612) = .380$
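The sums of products just illustrated lend themselves to a short computational sketch. The function name `rotate_pair` and the dictionary layout are our own choices, not part of the method; the multipliers and loadings are those quoted in the text.

```python
def rotate_pair(k1, k2, m10, m20):
    """Apply formulas (157) and (158) to one test's loadings in two factors."""
    k1_new = m10 * k1 + m20 * k2  # formula (157): new dependent-factor loading
    k2_new = m10 * k2 - m20 * k1  # formula (158): new independent-factor loading
    return k1_new, k2_new

M10, M20 = 0.811, -0.585  # multipliers obtained from test b

# Factor 1 and factor 2 loadings of tests a and c, from Table XVII.
loadings = {"a": (0.842, -0.359), "c": (0.612, 0.027)}

rotated = {
    test: tuple(round(v, 3) for v in rotate_pair(k1, k2, M10, M20))
    for test, (k1, k2) in loadings.items()
}
print(rotated)  # {'a': (0.893, 0.201), 'c': (0.481, 0.38)}
```

Running the same function over every row of the table would carry out one complete transformation of the two columns.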
After thus transforming the loadings of factors 1 and 2 with
factor 3 held constant, we proceed in the same manner to transform
loadings in factors 3 and 1 with factor 2 held constant.
The results of these transformations are shown in Tables XVII,
XVIII, and XIX. Notice that in all the tables the communalities
(h²) remain the same within the limit of accuracy determined
by the number of decimal places to which the computations have
been carried. To sum the squares along the rows and thus find
the h²’s unchanged is an important check on the correctness
of the arithmetic.


TABLE XVIII. FACTOR LOADINGS AFTER ONE ROTATION

Test      1        2        3       h²
a       +.893    +.201    +.079    .844
b       +.638      0      -.124    .422
c       +.481    +.380    +.043    .378
d       +.430    +.632    -.237    .640
e       +.509    +.494    -.233    .557
f       +.604    +.513    -.244    .688
g       +.845    +.303    +.050    .808
h       +.632    +.631    +.067    .802
i       +.455    +.434    +.267    .467
j       +.265    +.454    +.314    .375


TABLE XIX. FACTOR LOADINGS AFTER TWO ROTATIONS
(Starting Factor Loadings in Parentheses)

Test        1             2             3             h²
a       +.744 (.8)    +.201 (.3)    +.501 (.1)    .845 (.74)
b       +.619 (.7)      0   (.0)    +.200 (.1)    .423 (.50)
c       +.401 (.3)    +.380 (.3)    +.270 (.3)    .378 (.27)
d       +.491 (.3)    +.632 (.4)      0   (.6)    .641 (.61)
e       +.558 (.5)    +.494 (.4)    +.042 (.3)    .557 (.50)
f       +.647 (.6)    +.513 (.3)    +.078 (.5)    .688 (.70)
g       +.716 (.7)    +.303 (.4)    +.452 (.3)    .809 (.74)
h       +.521 (.7)    +.631 (.5)    +.364 (.3)    .802 (.83)
i       +.270 (.4)    +.434 (.5)    +.454 (.0)    .467 (.41)
j       +.080 (.2)    +.454 (.6)    +.403 (.0)    .375 (.40)
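The communality check described above can be sketched as follows. The rows are the loadings of tests a, b, and c before rotation and after one rotation, as read from the tables; the tolerance of .002 is our own assumption about what three-decimal rounding allows.

```python
# (factor 1, factor 2, factor 3) loadings before rotation and after one rotation.
before = {
    "a": (0.842, -0.359, 0.079),
    "b": (0.517, -0.373, -0.124),
    "c": (0.612, 0.027, 0.043),
}
after = {
    "a": (0.893, 0.201, 0.079),
    "b": (0.638, 0.0, -0.124),
    "c": (0.481, 0.380, 0.043),
}

def communality(row):
    """Sum of squared loadings along one row."""
    return sum(x * x for x in row)

# Largest change in any communality caused by the rotation.
max_diff = max(
    abs(communality(before[t]) - communality(after[t])) for t in before
)

# Agreement is expected only to about the third decimal place.
print(max_diff < 0.002)  # True
```

A discrepancy much larger than the rounding error in any row would point to an arithmetic slip in that row.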


INTERPRETATION OF APPLIED FACTOR ANALYSIS 
We shall now speak of the numbers in parentheses in Tables
XVII and XIX. They are the known loadings which the process
should have given back if it has a realistic meaning. They grow
out of an effort on the part of one of the authors to test empirically 
the validity of multiple-factor analysis. We arbitrarily set 
weightings for each of ten “tests” in each of three common 
factors and then added for each test a specific factor weighting 
that would make the communalities nearly 1.00. Then we made 
four independent tosses of 12 pennies for each of 100 hypo- 
thetical “subjects,” one toss to represent his real ability in each 
of the four factors. Thus a subject achieved a “score” on a 
test which was the sum of the number of heads turned up for 
him in each of the four fundamental abilities multiplied by 
the loading assigned to those abilities for the particular tests. 
Suppose, for example, subject 1 had a score of 6 heads in factor 
1, 3 in factor 2, 5 in factor 3, and 7 in the specific factor for test a. 
In test a the assigned loadings were .8 for factor 1, .3 for factor 2, 
.1 for factor 3, and .5 for the specific factor. His score on test a 
would be (6)(.8) + (3)(.3) + (5)(.1) + (7)(.5) = 9.7, multiplied
by 10 to avoid decimals = 97. Thus scores were made up for
all of the 100 hypothetical subjects for each of the ten tests.
Intercorrelation coefficients were then computed among these 
tests and entered in Table XIV. This duplicates the situation 
to which factor analysis attempts to get back by mathematical 
analysis. 
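The make-up of a single score can be sketched in Python as below. The function names, the use of a pseudo-random generator in place of real pennies, and the seed are all our own assumptions; only the loadings for test a and the worked example for subject 1 come from the text.

```python
import random

def toss_pennies(rng, n_pennies=12):
    """Number of heads in one toss of n_pennies fair pennies."""
    return sum(rng.random() < 0.5 for _ in range(n_pennies))

def composite_score(heads, loadings, scale=10):
    """Weighted sum of heads over the four factors, scaled to avoid decimals."""
    raw = sum(h * w for h, w in zip(heads, loadings))
    return round(raw * scale)

# Assigned loadings of test a: factors 1, 2, 3, and the specific factor.
loadings_test_a = (0.8, 0.3, 0.1, 0.5)

# Worked example from the text: subject 1 tossed 6, 3, 5, and 7 heads.
worked_example = composite_score((6, 3, 5, 7), loadings_test_a)
print(worked_example)  # 97

# One simulated subject (the seed is arbitrary).
rng = random.Random(0)
simulated_heads = tuple(toss_pennies(rng) for _ in range(4))
simulated_score = composite_score(simulated_heads, loadings_test_a)
```

Repeating the simulated step for 100 subjects and ten sets of loadings would give a score table of the kind from which the intercorrelations were computed.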

It will be observed that there is a fair amount of agreement
between these “starting” values and the ones which accrued
from the analysis. The coefficients of correlation between
starting values and final values are

                            Factor 1    Factor 2    Factor 3
BEFORE ROTATION        r      .63         .70        -.78
AFTER TWO ROTATIONS    r      .84         .73        -.72

These are highly significant r’s, though they are less than
perfect.
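As an illustration of how such a coefficient is obtained, the sketch below computes the product-moment r between the starting weights and the unrotated loadings in factor 1. The helper `pearson_r` is our own; the two columns are read from Table XVII.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Factor 1, tests a through j: starting weights and loadings before rotation.
starting_f1 = [0.8, 0.7, 0.3, 0.3, 0.5, 0.6, 0.7, 0.7, 0.4, 0.2]
unrotated_f1 = [0.842, 0.517, 0.612, 0.719, 0.702,
                0.790, 0.863, 0.882, 0.623, 0.481]

r_f1 = pearson_r(starting_f1, unrotated_f1)
print(round(r_f1, 2))  # 0.63
```

The other five coefficients can be checked in the same way by substituting the corresponding columns.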
But notice the negative r for factor 3. That is a phenomenon
of great importance to the practical worker, although little
attention seems to have been given to it in America. It is just
as possible for any factor (except the first) to come out with
the signs of all factor loadings reversed as with the correct
signs. That is inherent in the mathematics of the situation.
An inspection of Tables XVII and XVIII will reveal that the
reversal of all signs in any one of the three columns, or in all of
them, would not affect the cross products and hence would not
affect the power of the factor loadings to give back the correct
correlations and the correct communalities. Consider

$a_1b_1 + a_2b_2 + a_3b_3 = r_{ab}$

Perform this operation with factor loadings from Table XVII
and get $r_{ab}$. Now suppose all signs in any one of the factors
were reversed and again compute $r_{ab}$. It will be found to be
unchanged. Thus, reversed signs will support precisely the
same matrix of correlations and the same communalities as
correct signs. Thomson has observed and commented upon
this phenomenon in connection with both Thurstone’s and
Hotelling’s methods. This uncertainty about signs certainly
complicates the interpretation of the outcomes from multiple-
factor analysis. We must be prepared to take an arithmetically
large loading in a test as indicating that the test discriminates
with respect to the factor, a large negative weighting having
possibly the same meaning as a large positive one.
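The indifference of the cross products to a reversal of signs can be verified with a brief sketch. The helper names are ours; the loadings of tests a and b are those of Table XVII.

```python
# Loadings of tests a and b in factors 1, 2, and 3, from Table XVII.
a = (0.842, -0.359, 0.079)
b = (0.517, -0.373, -0.124)

def reproduced_r(x, y):
    """Sum of cross products of two tests' loadings."""
    return sum(p * q for p, q in zip(x, y))

def flip(row, factor):
    """Reverse the sign of one factor's loading (factor is a 0-based index)."""
    return tuple(-v if i == factor else v for i, v in enumerate(row))

r_ab = reproduced_r(a, b)

# Reversing a whole column changes the sign of both members of one product,
# so the product itself, and hence the sum, is unchanged.
unchanged = all(
    abs(reproduced_r(flip(a, f), flip(b, f)) - r_ab) < 1e-12 for f in range(3)
)
print(unchanged)  # True
```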

If we could know that the signs are reversed, we could rotate
out this reversal. If in Table XVIII we had transformed factors
3 and 2 instead of 3 and 1, with 2 the dependent one, we would
have let $b_3' = 0$, and the transformation technique explained on
pages 264 to 268 would have resulted merely in changing all
signs in factor 3 and shifting it into column 2, while it would have
left unchanged the weights in factor 2 but shifted them into
column 3. Then another transformation of 3 and 2 with 2
dependent and a final transformation of 2 and 1 to get rid of a
small negative loading in test b would have yielded a wholly
positive manifold in much closer agreement with the known
original loadings than those of Table XIX, all having correct
signs. By this procedure the r’s would be as follows:

Between final first factor loadings and original weightings, +.92.
Between final second factor loadings and original weightings, +.95.
Between final third factor loadings and original weightings, +.90.
These are very high r’s, and show high validity for the
technique. The reason they fall below 1.00 is probably on
account of unreliability due to sampling in our penny tossing 
and on account of the approximation involved in taking the 
highest inter-function r of a column as the communality. 
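The sign-reversing transformation of factors 3 and 2 described above is a limiting case of the same two formulas, and a small sketch makes this visible. The function names are ours, and we assume, as the text implies, that test b (with zero loading in factor 2 in Table XVIII) again serves as the pivot.

```python
import math

def multipliers(dep, ind):
    """Multipliers from the pivot test's dependent and independent loadings."""
    length = math.sqrt(dep**2 + ind**2)
    return dep / length, ind / length

def rotate_pair(k_dep, k_ind, m10, m20):
    """Formulas (157) and (158) for one test."""
    return m10 * k_dep + m20 * k_ind, m10 * k_ind - m20 * k_dep

# Test b in Table XVIII: factor 2 (dependent) is 0, factor 3 is -.124.
m10, m20 = multipliers(0.0, -0.124)  # gives 0 and -1

# Test a in Table XVIII: factor 2 = +.201, factor 3 = +.079.
new_a2, new_a3 = rotate_pair(0.201, 0.079, m10, m20)
print(round(new_a2, 3), round(new_a3, 3))  # -0.079 0.201
```

With multipliers of 0 and -1 the old factor 3 loading appears in column 2 with its sign reversed, and the old factor 2 loading appears unchanged in column 3, which corresponds to a rotation through 90 deg.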

In the geometrical method the equivalent of what we did in 
the transformation mentioned on page 270 is to rotate the axes 
through 90 deg. All the signs of any factor can always be 
reversed by rotating through 90 deg. But the hitch is that 
we never can know with certainty whether thus rotating through 
90 deg. will bring us nearer the truth or farther from it. We
are on highly speculative ground and can only do in this respect 
what looks most plausible. In the geometrical method the 
worker merely rotates his axes graphically until he gets what 
looks like most plausible results. In our empirical work with 
our method of transformation we appear to have gotten best 
results in a three-factor table by first transforming factors 2 and 1
with 1 dependent, then factors 3 and 2 with 2 dependent, and 
finally factors 2 and 1 with 1 dependent. An analogous proce- 
dure with more factors would be to transform the factors in suc- 
cessive overlapping pairs beginning at the first, each time making 
the earlier of the pair the dependent one; then repeat the process 
in the same order to eliminate remaining negative terms. But, 
while we see a glimmer of theoretical basis for this, a satisfactory 
theoretical basis for determining a unique method of transforma- 
tion or rotation still awaits discovery. 

The reader is urged to try this type of transformation on our 
Table XVII as an exercise; and he is especially challenged to 
seek to discover some theoretical basis for a unique solution. 


AN APPLIED EXAMPLE


But the fact that our situation was a made-to-order one made 
the procedure in the above example work out very smoothly. 
It was easy to get in it a positive manifold (all positive loadings)
and a common factor because the situation had been set up that 
way. But not all practical applications work out so neatly 
as that. If the tests are of such a nature that some of them are
inherently negatively intercorrelated in respect to any one 
factor, it will be impossible to get a positive manifold. Such 
is the case, for example, in the Bernreuter Personality Inventory,
where some of the items measuring presence of neurotic temperament
are so stated as to require a plus mark in the scoring while
others are so stated as to require a negative mark. Here we 
could get a positive manifold in a matrix made from the items 
only if we transposed the scoring key so that all items would be 
oriented in the same direction. 

Another condition under which we may fail to get a positive 
manifold is when the tests are short or populations small, so 
that the r’s have low reliability and some weightings are nega- 
tive by chance. Among the exercises at the end of this chapter 
we give a table of intercorrelations among a set of tests used 
for predicting achievement in the Engineering School at the 
Pennsylvania State College. These resulted in the factor loadings
of Table XX as taken from the original work sheets. When
the technique of rotation described above was applied to this 
table, factors 1 and 2 being transformed with the others held 
constant, two small negative values occurred in factor 1 at the 
first rotation. Thus there was little promise from further 
rotations of other factors in the same direction involving factor 1. 
When other rotations were made about the most promising axes, 
they also involved some negative values; hence a positive mani- 
fold could not be achieved, and the best that could be done to 
make the factors comparable in meaning was to balance them 
against one another in such a way that each would have about 
the same extent of negative signs. This is shown in Table XXI. 

This difficulty is much more easily resolved in the graphical 
method of rotation than in our algebraic method. The Thurs- 
tone practice does not take the meaning of a positive manifold 
strictly; it accepts as zero small negative loadings, attributing 
them to unreliability. So, loadings down to —.20, or even down 
to —.40 if the population is not large, are accepted as not violating 
the principle of a positive manifold. The axes are rotated until 
they pass through the densest cluster of points, within the 
liberal definition of positive manifold just mentioned. This 
sort of process is more awkward by our algebraic procedure 
because it is not easy to see which test (other than the lowest 
one) to select for a zero loading. But really it makes little 
difference, provided the differential rotation is not great, because 
within reasonable limits the loadings of the tests within each 
factor remain in the same relative order so that the interpretation
of the outcome is unaffected. But if the worker wishes to try
the more flexible graphical method of rotation as a hint of which 
test to accept as the one with zero loading, he can get directions 
for this process from the books by Guilford and by Thurstone, 
which are listed in the bibliography at the end of this chapter. 
However, no matter what method of rotation is employed, we 
cannot determine whether or not there is a general factor common 
to all the tests; the indeterminism of the methods of rotation 
forestalls that, unless the general factor is very prominent. 

The foregoing account shows how arbitrary are the arithmetic
loadings when conditions cannot be imposed that determine a
unique solution. They are equally arbitrary by our algebraic
method and by the Thurstone geometrical method. As a matter
of fact, the exact arithmetic weightings are not in themselves
important; what we want to know is which tests go together as
possessing the ability to measure a certain one or more of the
factors. For purely survey measurement purposes it would be
satisfactory that a given test stand relatively high on all factors.
But for diagnostic purposes it is desirable that we find tests which
are high in ability to measure one of the factors while being very
low (ideally zero) in weightings in the other factors. In the
geometric system this is sometimes studied by plotting the
positions of the tests on a plane if there are two factors or on a
sphere if there are three, sticking hatpins in a ball and studying
their relation to the spherical triangle generated by the 90-deg.
central angles between the vectors. Beyond three factors the
process cannot be carried graphically. But, as a matter of fact,
all these relations show up by merely inspecting Tables XX and
XXI, or Tables XVII and XIX. We want, for a diagnostic
battery, tests which agree in being high on one factor and low
on the others. Failing this, we select as nearly as possible
according to this principle. If the system lends itself well to this
selection, our task will be a straightforward one. Under these
circumstances in the geometric form the tests plotted on a
hypersurface would prevailingly fall around the vertices of a
hyperpolygon (of a plane triangle if two factors, of a spherical
triangle if three, etc.). The problem would then be said to exhibit
“simple structure.” If a test or tests were not high in one
factor and low in the others but were moderate in several or in
all, then such test would not fall at the vertex, or even along the
side, of the hyperpolygon but somewhere within the polygon;
then the system would lack simple structure. But such relations
can also be sensed directly from the table of weightings; indeed,
with a little practice and insight into trigonometry, one can soon
become quite adept at picturing just how the tests would fall if
corrected for uniqueness (weightings divided by h in the table of
original values) and plotted on a hypersurface.


TABLE XX. ORIGINAL FACTOR LOADINGS (BEFORE ANY ROTATION) FROM
NINE TESTS INTENDED TO PREDICT ACADEMIC SUCCESS

                                             Factors
Tests                              1        2        3        4       h²
1. Number completion                     -.359    +.360    -.124    .438
2. English usage                         -.289    -.102    -.194    .165
3. Scientific information                -.312    -.299    -.189    .292
4. Arithmetic problems                   -.268    +.087    +.394    .325
5. MacQuarrie block                      +.423    +.282    -.079    .563
6. Thurstone-Jones sketching    +.531    +.142    +.150    -.080    .331
7. Thurstone-Jones cards        +.550    +.305    +.084    -.114    .416
8. Detroit pulleys              +.364    +.103    -.324    -.126    .264
9. Minnesota form board         +.354    +.312    -.181    +.342    .357


TABLE XXI. FACTOR LOADINGS ON THE SAME TESTS AFTER
THREE ROTATIONS

                                             Factors
Tests                              1        2        3        4       h²
1. Number completion            +.433    -.128    +.368    +.314    .438
2. English usage                +.370    +.068    -.076    +.133    .165
3. Scientific information       +.407    +.241    -.186    +.184    .292
4. Arithmetic problems          -.047    +.024    +.138    +.550    .324
5. MacQuarrie block             +.018    +.325    +.669    -.101    .564
6. Thurstone-Jones sketching    +.167    +.292    +.462    +.072    .332
7. Thurstone-Jones cards        +.111    +.413    +.481    -.046    .416
8. Detroit pulleys              +.166    +.487      0        0      .265
9. Minnesota form board         -.308    +.435    +.213    +.166    .357

These observations prove the secondary character of the whole
process of rotation. The configuration of points representing
the placement of the tests on the hypersurface would be precisely
the same if plotted from the original loadings (Table XVII
or XX) as if plotted after any kind and amount of rotation.
For this configuration of points is determined by the
intercorrelations among the tests, and those are given in the data
and cannot be changed. We get different arithmetic weightings only
by looking “down” upon these points from different positions,
and hence getting different “projections.” Inspection of
Table XVII in relation to Table XIX and of Table XX in relation
to Table XXI will reveal substantially the same story before
and after rotation. Especially in all except the first factor the
tests relatively high before rotation are prevailingly the same as
those relatively high after rotation. In Tables XVII and XIX
the r between original loadings and those after rotation is .71 in
factor 1; .93 in factor 2; and .86 in factor 3. It is factor 1 that
suffers most from failure to rotate. As it stands, it is likely to
be deceptive as to the extent of the presence of a common factor.
In Table XX it looks as if all the tests have a common factor
(factor 1), but that disappears in Table XXI, after rotation. 
We, therefore, recommend rotation as facilitating interpretation, 
but we point out that its function is a secondary one. 
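That the configuration is untouched by rotation can be illustrated numerically: the cross product of two tests' loadings in the transformed pair of factors is the same before and after the transformation. In the sketch below the function name is ours, the multipliers are computed at full precision rather than rounded to three places, and the loadings are those of tests a and b in factors 1 and 2 of Table XVII.

```python
import math

# Factor 1 and factor 2 loadings of tests a and b (b is the pivot test).
a = (0.842, -0.359)
b = (0.517, -0.373)

def rotate(k, m10, m20):
    """Formulas (157) and (158) applied to one test."""
    return (m10 * k[0] + m20 * k[1], m10 * k[1] - m20 * k[0])

length = math.hypot(b[0], b[1])
m10, m20 = b[0] / length, b[1] / length

cross_before = a[0] * b[0] + a[1] * b[1]
a_rot, b_rot = rotate(a, m10, m20), rotate(b, m10, m20)
cross_after = a_rot[0] * b_rot[0] + a_rot[1] * b_rot[1]

# The contribution of the two factors to the reproduced correlation is unchanged.
print(abs(cross_before - cross_after) < 1e-12)  # True
```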

After analyzing a correlational matrix for its factors, it is
natural to try to interpret the meaning of these factors. This
must perforce be a speculative process. We observe which 
tests are high in a factor and which low, and in which factors 
each is high and each low, and then try upon this showing our 
hypothetical interpretation. In Mercer’s study, involved in 
Tables XX and XXI, factor 1 (Table XXI) looks like a verbal 
academic-information ability, since English usage and scientific 
information play up high in it and the visualizing and manipula- 
tive tests have low weightings. Factor 2 is clearly ability to 
deal with visual space relations. Factor 4 seems to be mathe- 
matical problem-solving ability, since it is very high in arithmetic 
problems, moderate in number completion, and low in the 
visualizing tests. Factor 3 is harder to name; it is high in the 
MacQuarrie block tests and moderately high in number com- 
pletion and in the Thurstone-Jones sketching and card tests. 
Perhaps it is an ability to grasp rational space relations. But 
what was said above about the possibility of reversed signs in 
some of the factors might upset these interpretations. 
It must be remembered that the factor loadings from a particular
sample are subject to considerable uncertainty on account
of unreliability of measurement, especially with the higher 
numbered factors, although reliability formulas for the factor 
loadings have not yet been developed. On account of the prob- 
ability of fluctuations in weightings of particular tests from 
sample to sample and the uncertainty about signs, speculation 
as to the nature of the factors must be regarded as highly 
tentative. 


THE RELATION OF FACTOR ANALYSIS TO A CRITERION 

The factor analysis technique, as employed above and as 
usually employed, suffers from the lack of a criterion. Only 
such factors emerge as are entered in the battery. They are 
thus entered because they are alleged to be measures of a cer- 
tain function. So the analysis shows only what the tests have 
in common, not necessarily what are the factorial components of 
the trait alleged to be measured. By contrast with this, the 
multiple-regression technique gets weightings for factors in 
relation to their importance in the team in predicting a criterion 
(see pages 220 to 230). It would be feasible to put a criterion 
in with the battery in multiple-factor analysis; then it could be 
discovered how largely the criterion itself is weighted with 
each of the factors; and those factors could be made the basis 
for selecting tests with which the criterion is heavily weighted.¹
As Exercise 3, page 278, we give Mercer’s criterion r’s (the r
of each test with academic achievement). We suggest that
the student add these as a tenth row and a tenth column to the
correlational matrix and see with what factor loadings the
criterion (academic success) emerges.

¹ At the suggestion of the senior author, Henry L. Sisk did this. See “A
Multiple Factor Analysis of Mental Abilities in the Freshman Engineering
Curriculum,” J. Psychol., Vol. 9, pp. 165-177 (1939).


THE HOTELLING METHOD
The Thurstone centroid method, which we set forth in this
chapter, has a long lead in practice over any other method. At
the time this book goes to press it has been used in research
applications probably a hundred times as frequently as any rival
method (excluding the older Spearman tetrad-difference
technique). But some people think that the method developed by
Hotelling and furthered by Kelley may prove in the end to be
superior. The exposition of the Hotelling method is given by its 
authors in terms of hyperspace geometry, just as Thurstone’s is. 
We judge that the arithmetic work is roughly the same in both 
methods. But the Hotelling method calls for no rotation of axes, 
for which reason its solutions are unique. Furthermore, some 
headway has been made in deriving standard-error formulas for 
the Hotelling factors. But as yet there is no certain evidence 
upon which to base a choice. Thomson says on this point: 


It will be seen from these first chapters that the different systems of 
factors proposed by different schools of “‘factorists” have each their own 
advantages and disadvantages, and it is really impossible to decide 
between them without first deciding why we want to make factorial 
analyses at all. 


We cannot here take space to discuss a second method. The 
interested reader is referred to Thomson’s semipopularized 
account of the several methods and to the publications by Hotel- 
ling and by Kelley in the bibliography at the end of this chapter. 


Exercises 


1. By the tetrad-difference technique determine whether or not there is an 
element common to all the measures involved in Table IV, pages 58 to 61, 
and determine the correlation of each test with this common element.
Venture an interpretation as to what this common element is.

2. The table (page 278) from a dissertation by Margaret Mercer, gives a 
set of intercorrelations among certain tests presumably related to success 
in the Engineering School of the Pennsylvania State College. Find how 
many factors are represented, calculate the factor loadings, and compare 
your findings with those given earlier in this chapter. 

3. The following are the correlation coefficients of each of the above tests 
with academic success (grade-point averages) in the Engineering School. 
Put these criterion scores in the matrix as a tenth row and a tenth column, 
recompute the loadings, and see which tests are loaded with the same factors 
with which the criterion is heavily loaded. Interpret.


¹ But E. B. Wilson and Jane Worcester challenge the psychological
meaningfulness of the Hotelling factors. See “Note on Factor Analysis,”
Psychometrika, Vol. 4, pp. 133-148 (June, 1939).
TABLE XXII. INTERCORRELATIONS AMONG NINE ABILITY MEASURES

Test                            1      2      3      4      5      6      7      8      9
1. Number completion            —    .158   .144   .279   .205   .144   .214   .083  -.089
2. English usage              .153     —    .196   .030  -.029   .100   .024   .020  -.056
3. Scientific information     .144   .196     —    .109  -.085   .158   .002   .194   .010
4. Arithmetic problems        .279   .030   .109     —    .058   .053   .027   .021   .195
5. MacQuarrie block           .205  -.029  -.085   .058     —    .426   .412   .262   .234
6. Thurstone-Jones sketching  .144   .100   .158   .053   .426     —    .317   .006   .227
7. Thurstone-Jones card       .214   .024   .002   .027   .412   .317     —    .245   .269
8. Detroit pulleys            .083   .020   .194   .021   .262   .006   .245     —    .179
9. Minnesota form board      -.089  -.056   .010   .195   .234   .227   .269   .179     —
CHAPTER X
THE NORMAL PROBABILITY CURVE 


Derivation of the Formula. In the algebra of chance it is
shown that if each of n independent events has p chances to
occur and q chances to fail, the total combinations of successes
and failures is prophesied by the binomial expansion

$(q + p)^n = q^n + nq^{n-1}p + \frac{n(n-1)}{1 \cdot 2}q^{n-2}p^2 + \cdots + p^n$     (159)

where the exponent of the p expresses the number of successes 
and that of the q the number of failures, and the coefficients 
represent the relative frequencies with which each of these com- 
binations of successes and failures is likely to occur. Any 
particular term in the expansion represents the probability of 
the occurrence of the number of successes indicated by the 
exponent of p in that particular term. If p and q are equal, 
indicating an equal probability of success or failure (half the 
times the chance of success, the other half of failure), the binomial 
becomes 


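$\left(\frac{1}{2} + \frac{1}{2}\right)^n = \left(\frac{1}{2}\right)^n + n\left(\frac{1}{2}\right)^{n-1}\left(\frac{1}{2}\right) + \frac{n(n-1)}{1 \cdot 2}\left(\frac{1}{2}\right)^{n-2}\left(\frac{1}{2}\right)^2 + \cdots + \left(\frac{1}{2}\right)^n$

Since $\left(\frac{1}{2}\right)^{n-r}\left(\frac{1}{2}\right)^r = \left(\frac{1}{2}\right)^n$, this expression obviously becomes

$\left(\frac{1}{2} + \frac{1}{2}\right)^n = \left(\frac{1}{2}\right)^n + n\left(\frac{1}{2}\right)^n + \frac{n(n-1)}{1 \cdot 2}\left(\frac{1}{2}\right)^n + \cdots + \left(\frac{1}{2}\right)^n$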


There is much in genetics to suggest this as descriptive of the 
operation of determiners in controlling growth. Determiners 
in the body cells are inherited from the two parents, and, on the 
law of chance in mating, in the long run a determiner for the 
presence of a trait more favorable than the average is equally 
likely to be present or absent. When present, such a determiner 
contributes something toward the characteristics we measure
as success—height in a cornstalk, quickness of reaction, intel- 
ligence, or academic success in a pupil. It is plausible enough 
that these composite characteristics may be the outcome of 
the operation of many determiners for elemental constituent 
traits obeying singly the principle of equal probability of presence 
or absence and obeying jointly the principle of chance described 
by the binomial expansion. In like manner, behavior that is not 
the expression of the chance combination of elemental traits in 
the reaction of a biological organism but is the product of a 
combination of elemental factors or forces, each unit of which
obeys the laws of chance, or behavior that is the result of an 
aggregation of constituent units each of which is determined 
according to laws of chance (as groups composed of the sort 
of individuals we have been discussing) may be expected to 
conform to this same principle of chance combinations as 
expressed in the binomial expansion. 

At any rate the composite traits or conditions we measure 
in educational statistics and in most other statistical applications 
arrange themselves with remarkable frequency in distributions 
that conform to this principle—which we call normal distribution. 
Consequently, the assumption of normality of distribution 
underlies much of our statistical work. The curve of a normal 
distribution is a peculiar bell-shaped one with which all students 
are already familiar. It will be the essential burden of this
chapter to prove that the formula for the curve is

$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$

Since the normal distribution is of such fundamental importance
in statistics and since the student makes so much of its
mathematical properties, the reader will wish to see a development
of its equation.


From our previous discussion of the binomial expansion we
have seen that the successive terms,

$\left(\frac{1}{2}\right)^n,\; n\left(\frac{1}{2}\right)^n,\; \frac{n(n-1)}{1 \cdot 2}\left(\frac{1}{2}\right)^n,\; \ldots,\; \frac{n(n-1)(n-2)\cdots(n-s+1)}{1 \cdot 2 \cdot 3 \cdots s}\left(\frac{1}{2}\right)^n,\; \ldots,\; \left(\frac{1}{2}\right)^n$

represent the probabilities of the occurrence of 0, 1, 2, 3, . . . , s,
. . . , or n successes, respectively. The last factor of the
factorial expression in the denominator of each term represents
the number of successes predicted by that particular term.
We shall refer to these binomial terms as ordinates and to 
the corresponding numbers of successes as scores. Our problem 
then becomes that of obtaining an equation which will express the 
dependency of ordinate upon score value. 
In Fig. 19 we have plotted “number of successes” along
the horizontal axis and corresponding “probability of success”
along the vertical axis. The extremities of successive ordinates
are joined in order to obtain the resulting frequency polygon.

FIG. 19. Binomial frequency polygon.

The Y measurements are ordinates corresponding to the X
measurements which stand for the different score values or
number of successes. For example, $y_s$ represents graphically
the probable frequency of the occurrence of the score value $x_s$.

For the sake of clearness we shall list our set of scores together
with their respective probabilities in the ordinate-abscissa
notation. Our set is composed of

$y_0 = \left(\frac{1}{2}\right)^n$, the probability of a score of value 0, or $x_0$

$y_1 = n\left(\frac{1}{2}\right)^n$, the probability of a score of value 1, or $x_1$

$y_2 = \frac{n(n-1)}{1 \cdot 2}\left(\frac{1}{2}\right)^n$, the probability of a score of value 2, or $x_2$

$y_3 = \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\left(\frac{1}{2}\right)^n$, the probability of a score of value 3, or $x_3$

$y_s = \frac{n(n-1)(n-2)\cdots(n-s+1)}{1 \cdot 2 \cdot 3 \cdots s}\left(\frac{1}{2}\right)^n$, the probability of a score of value s, or $x_s$

$y_{n-1} = n\left(\frac{1}{2}\right)^n$, the probability of a score of value n - 1, or $x_{n-1}$

$y_n = \left(\frac{1}{2}\right)^n$, the probability of a score of value n, or $x_n$


Unless the reader is skilled in the manipulation of algebraic
expressions, he may perhaps find difficulty at first in understanding
how the expression for $y_s$ was obtained. Notice that s is a
generalized expression for any y subscript; it may stand for
any value of x and thus serve to designate the group of scores
corresponding to that value of x. Observe, for example, the
form of the expression for $y_3$. In this instance, s = 3. Moreover,
we notice that the factorial expression in the denominator
terminates with 3. The last factor in the numerator is (n - 2),
which is precisely (n - 3 + 1), in which s has been replaced
by 3. Since the expression gives the desired quantity for any
particular chosen value of s, we are led by induction to the general
expression for $y_s$. (The reader should verify the expression for
s = 4, 5, 6, etc.)

The general expression for $y_s$ can be written in a simpler 
form by multiplying both numerator and denominator by 
the quantity 

$$(n - s)(n - s - 1)(n - s - 2)(n - s - 3) \cdots 3 \cdot 2 \cdot 1,$$

which is, of course, $(n - s)!$. The equation for $y_s$ becomes 

$$y_s = \frac{n(n-1)(n-2)\cdots(n-s+1)(n-s)(n-s-1)\cdots 3\cdot 2\cdot 1}{(1\cdot 2\cdot 3\cdots s)(n-s)(n-s-1)\cdots 3\cdot 2\cdot 1}\left(\frac{1}{2}\right)^n$$

which in terms of factorials may be written 

$$y_s = \frac{n!}{s!\,(n-s)!}\left(\frac{1}{2}\right)^n \tag{160}$$
[Probability of a score $s$ in the binomial $(\frac{1}{2} + \frac{1}{2})^n$]
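As a small numerical check of Eq. (160), the sketch below evaluates $y_s$ for every score of a short series of trials and confirms that the ordinates sum to unity. The function name and the choice $n = 10$ are illustrative assumptions, not part of the original text.

```python
from math import comb

def binomial_half_probability(n: int, s: int) -> float:
    """Probability of exactly s successes in n trials when p = q = 1/2, Eq. (160)."""
    # comb(n, s) is n! / (s! (n - s)!)
    return comb(n, s) * 0.5 ** n

n = 10  # illustrative number of trials
probabilities = [binomial_half_probability(n, s) for s in range(n + 1)]

print(probabilities[3])    # y_3 for n = 10
print(sum(probabilities))  # the ordinates of the binomial sum to unity
```

With $n = 10$ the ordinate $y_3$ is $120/1024$, since $n(n-1)(n-2)/(1\cdot 2\cdot 3) = 120$.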


Now suppose that in our development $n$ is very large and for 
convenience is even and equal to $2r$. Then the probability of 
the occurrence of exactly $r$ successes would be $y_r$ and would be 
expressed by the equation 

$$y_r = \frac{(2r)!}{r!\,(2r-r)!}\left(\frac{1}{2}\right)^{2r}$$

in which $n$ has been replaced by $2r$, or 

$$y_r = \frac{(2r)!}{r!\,r!}\left(\frac{1}{2}\right)^{2r} \tag{161}$$
[Probability of obtaining exactly $r$ successes in the binomial $(\frac{1}{2} + \frac{1}{2})^{2r}$]


The task of evaluating the expression for y, becomes laborious 
for even small values of r when substitution is made directly 
into the formula. A good approximation can be obtained, how- 
ever, by using Stirling’s approximation formulas for factorials. 
Stirling’s formula is as follows: 


$$n! = e^{-n}\,n^{n}\,(2\pi n)^{1/2} \quad \text{(Approx.)} \tag{162}$$
(Stirling’s approximation formula for factorials)


We shall apply Stirling’s formula to evaluate Eq. (161). 
Making the substitution and remembering that in the numerator 
of (161) $2r$ must be used as the $n$ of the approximation formula 
and that in the denominator we use $r$, we find 

$$y_r = \frac{e^{-2r}(2r)^{2r}(2\pi\cdot 2r)^{1/2}}{e^{-r}r^{r}(2\pi r)^{1/2}\;e^{-r}r^{r}(2\pi r)^{1/2}}\left(\frac{1}{2}\right)^{2r}$$

Upon simplifying by canceling terms that are common to both 
numerator and denominator, we are left with the formula 

$$y_r = \frac{1}{(\pi r)^{1/2}} \tag{163}$$
[Approximate probability of obtaining exactly $r$ successes in the binomial $(\frac{1}{2} + \frac{1}{2})^{2r}$]


Equation (163) gives a very good approximation to the prob- 
ability of obtaining exactly r successes out of the range of 2r 
scores. It denotes, therefore, the ordinate at the mean of the 
binomial distribution since we are taking n = 2r. 
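The quality of the approximation in Eq. (163) can be inspected numerically. The sketch below compares the exact central ordinate of Eq. (161) with $1/(\pi r)^{1/2}$ for a few values of $r$; the function names and the chosen values of $r$ are assumptions made for illustration.

```python
from math import comb, pi, sqrt

def exact_central_ordinate(r: int) -> float:
    """Exact y_r from Eq. (161): (2r)! / (r! r!) * (1/2)^(2r)."""
    return comb(2 * r, r) / 4 ** r

def stirling_central_ordinate(r: int) -> float:
    """Approximate y_r from Eq. (163): 1 / sqrt(pi * r)."""
    return 1.0 / sqrt(pi * r)

for r in (5, 50, 500):
    exact = exact_central_ordinate(r)
    approx = stirling_central_ordinate(r)
    # The ratio approaches 1 as r grows.
    print(r, exact, approx, approx / exact)
```

The ratio of the approximate to the exact value moves toward 1 as $r$ increases, which is the sense in which the approximation is "very good" for large $n$.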

In developing the equation for the normal curve, we are seeking 
an expression that will hold for all points in the distribution of 
scores on either side of the mean value. If we let $x$ be the distance 
from the mean of the distribution to any other given 
point, then $r + x$ or $r - x$ will represent the score whose probability 
or ordinate value we are looking for. This amounts to 
finding an expression for exactly $r + x$ successes or a score of 
$r + x$ in the binomial distribution. 

Substituting $r + x$ for $s$ and $2r$ for $n$ in the general formula 
(160), we have 

$$y_x = \frac{(2r)!}{(r+x)!\,[2r-(r+x)]!}\left(\frac{1}{2}\right)^{2r}$$

or 

$$y_x = \frac{(2r)!}{(r+x)!\,(r-x)!}\left(\frac{1}{2}\right)^{2r} \tag{164}$$
[Probability of a score $x$ units from the mean in the binomial $(\frac{1}{2} + \frac{1}{2})^{2r}$]


We now come to the task of evaluating Eq. (164). Before 
applying the approximation formula, it is convenient to rearrange 
the form of the equation by multiplying and dividing the right- 
hand member by r!r! We make this change and write 


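$$y_x = \left[\frac{(2r)!}{r!\,r!}\left(\frac{1}{2}\right)^{2r}\right]\frac{r!\,r!}{(r+x)!\,(r-x)!}$$

The expression within brackets is the same as Eq. (161), which we 
have already found to be approximately $1/(\pi r)^{1/2}$. 

Hence, 

$$y_x = \frac{1}{(\pi r)^{1/2}}\cdot\frac{r!\,r!}{(r+x)!\,(r-x)!} \tag{A}$$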


The factor $\dfrac{r!\,r!}{(r+x)!\,(r-x)!}$ is evaluated by applying the Stirling 
approximation formula. The task is rather long and 
involves detailed simplification. It is left to the student as an 
exercise in algebraic manipulation.¹ It is sufficient to say here 
that its value is approximately $e^{-x^2/r}$. Substituting this value in 
(A) we have as our approximation formula for the probability of 
the occurrence of a deviation $x$ in the binomial distribution 
$(\frac{1}{2} + \frac{1}{2})^{2r}$ 

$$y_x = \frac{1}{(\pi r)^{1/2}}\,e^{-x^2/r} \tag{B}$$

which, on replacing $r$ by $n/2$, reduces to 

$$y_x = \frac{2}{\sqrt{2\pi n}}\,e^{-2x^2/n} \tag{165}$$
[Approximate probability of a score $x$ units from the mean in the binomial $(\frac{1}{2} + \frac{1}{2})^{n}$]

¹ Hint: Take logarithms of the expression, use Stirling’s approximation 
formula for the factorials, simplify, and then take the antilogarithms. 

We show on page 298 that for a point binomial $\sigma^2 = npq$. In 
this development $p = q = \frac{1}{2}$; so that $\sigma^2 = n/4$, or $n = 4\sigma^2$. 
Substituting $4\sigma^2$ for $n$ in Eq. (165) and dropping the subscript 
from $y$ to denote that we shall assume the formula to hold 
continuously throughout the range of $x$ values, we have 

$$y = \frac{2}{\sqrt{2\pi\cdot 4\sigma^2}}\,e^{-2x^2/4\sigma^2}$$

which reduces to 

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2} \tag{166}$$
(Approximation equation for the point binomial)


Formula (166) expresses the probability, within the limits of 
Stirling’s formula, for the point binomial for n very large. 
Mathematicians often make a more direct approach to the 
normal curve equation by setting up a differential equation of 
the form 
$$\frac{dy}{dx} = -Cxy$$

which satisfies certain conditions which we know to be true for 
the normal curve. The integration is performed as follows: 

$$\frac{dy}{dx} = -Cxy$$

Separating the variables, 

$$\frac{dy}{y} = -Cx\,dx$$

Integrating, 

$$\log y = -C\,\frac{x^2}{2} + K$$

Solving for $y$, 

$$y = e^{K}e^{-Cx^2/2} = Ae^{-Cx^2/2}$$

where $A = e^{K}$. As was shown in the previous edition of this book (pages 231 to 
234), 

$$A = \frac{1}{\sigma\sqrt{2\pi}} \quad \text{and} \quad C = \frac{1}{\sigma^2}$$

whence 

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2} \tag{167}$$
(Normal probability function)
This is precisely the same as formula (166). We see, there- 
fore, that our approximation formula, (166), expressing the 
point binomial probability is precisely the formula for the normal 
curve. 
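The agreement between the point binomial and formula (166) can be seen directly by computing both. The following sketch assumes $n = 100$ (so that $\sigma^2 = n/4 = 25$) and compares the exact binomial probability of a score $x$ units above the mean with the normal ordinate; the names used are illustrative.

```python
from math import comb, exp, pi, sqrt

def binomial_ordinate(n: int, x: int) -> float:
    """Exact probability of a score x units above the mean n/2 (n even, p = q = 1/2)."""
    return comb(n, n // 2 + x) / 2 ** n

def normal_ordinate(sigma: float, x: float) -> float:
    """Formula (166): ordinate of the normal curve of unit area at deviation x."""
    return exp(-x * x / (2 * sigma * sigma)) / (sigma * sqrt(2 * pi))

n = 100               # illustrative number of trials
sigma = sqrt(n / 4)   # sigma^2 = npq with p = q = 1/2

for x in (0, 5, 10):
    print(x, binomial_ordinate(n, x), normal_ordinate(sigma, x))
```

For this $n$ the two columns differ only in the fourth decimal place or beyond, and the discrepancy shrinks further as $n$ is increased.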
The right-hand member of formula (167) can be factored as 

$$y = \frac{1}{\sigma}\left(\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\right)$$

in which the quantity within parentheses is usually denoted by 
the letter $z$. Values of $z$ have been tabulated for various values 
of $x/\sigma$. But from the above expression it is evident that 

$$y = \frac{1}{\sigma}\,z \tag{C}$$

So if our distribution has unit area and unit standard deviation, 
$y = z$ and the $z$ values are merely the ordinates of a normal 
distribution of unit area and unit standard deviation. The 
equation for $z$ would, of course, be 

$$z = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \tag{D}$$

If $N$ is the total area under the curve, instead of unity, the 
equation of the normal curve becomes 

$$y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2} \tag{167a}$$
(Normal probability curve of area $N$)
Equation (167a) follows from the fact that the area obtained by 
integration would be N times that obtained by integrating (D), 
which we shall see is unity. 




PROPERTIES OF THE NORMAL PROBABILITY CURVE 


Modal Ordinate.—At the origin, that is when x = 0, formula 
(167) becomes 


$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{0} = \frac{1}{\sigma\sqrt{2\pi}} \qquad (\text{for } e^{0} = 1)$$

This tells us that $\dfrac{1}{\sigma\sqrt{2\pi}}$ is the value of the $y$ ordinate at the middle 
of the distribution because we have measured our $x$ deviations 
from this middle point. If we designate this modal ordinate by 
$y_0$, the normal curve equation may be written 

$$y = y_0\,e^{-x^2/2\sigma^2} \tag{E}$$
Area under the Normal Curve.—Since we began the development 
with the expansion of the binomial $(\frac{1}{2} + \frac{1}{2})^n$, the sum of all 
the ordinates of the binomial is unity; for $(\frac{1}{2} + \frac{1}{2})^n = (1)^n = 1$. 
This means that the curve whose equation is (166) should by 
analogy obey the condition 

$$\int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,dx = 1$$

i.e., the area under the curve from a distance infinitely far to the 
left of the $y$ axis to a distance infinitely far to the right of the 
same axis should equal unity. 

The $x^2$ term in the integrand assures us that the curve is 
symmetrical with respect to the $y$ axis; for no matter whether $x$ 
is positive or negative, its square must necessarily be positive. 
Thus the area under the curve between the limits $-\infty$ and $+\infty$ 
is the same as twice the area under the curve between the limits 
0 and $+\infty$. Letting $A$ denote the area under the curve, we may 
write 

$$A = 2\int_{0}^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,dx$$

Since $\dfrac{1}{\sigma\sqrt{2\pi}}$ is the height of the ordinate at the origin and 
represents, therefore, the modal ordinate of the distribution, it is a constant 
quantity in any given distribution. We may, consequently, 
remove it from beneath the integral sign without affecting the 
integration involved. Then our expression for the area may be 
written 

$$A = \frac{2}{\sigma\sqrt{2\pi}}\int_{0}^{\infty}e^{-x^2/2\sigma^2}\,dx$$


It so happens that the evaluation of the integral appearing in 
this last equation involves the application of certain advanced 
mathematical functions which, if introduced at this time, might 
serve to confuse the reader. It is listed among the standard 
types in most integral tables and its value is given as $\sqrt{2\pi}\,\sigma/2$. 
We see that this value is the reciprocal of the coefficient of the 
integral itself in the equation directly above. The product of 
the two quantities, of course, is unity; and we see that A = 1. 
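For a reader without an integral table at hand, the value of the integral can be checked numerically. The sketch below applies the trapezoidal rule, which is a substitute technique and not the method of the text; the value $\sigma = 2$, the upper limit of $10\sigma$ standing in for infinity, and the function names are all assumptions made for illustration.

```python
from math import exp, pi, sqrt

def trapezoid_area(f, lower, upper, steps):
    """Approximate the integral of f from lower to upper by the trapezoidal rule."""
    width = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * width)
    return total * width

sigma = 2.0  # illustrative standard deviation

# Integral of e^(-x^2 / 2 sigma^2) from 0 to (effectively) infinity.
half_integral = trapezoid_area(
    lambda x: exp(-x * x / (2 * sigma * sigma)), 0.0, 10 * sigma, 20000
)
# Multiply by the coefficient 2 / (sigma * sqrt(2 pi)) to obtain the area A.
area = 2.0 / (sigma * sqrt(2 * pi)) * half_integral

print(half_integral, sqrt(2 * pi) * sigma / 2)  # the two should agree closely
print(area)                                     # should be very nearly 1
```

The truncation at $10\sigma$ is harmless here because the ordinate is negligibly small that far from the mean.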

Standard Deviation of a Normal Distribution—We have 
observed elsewhere that in the case of a finite number of discrete 
variates the standard deviation is defined by the relation 
$\sigma^2 = \Sigma x^2/N$, where $x$ denotes a deviation and $N$ the number of 
scores. The analogous definition in the case of the continuous 
normal distribution which extends infinitely far to the left and 
to the right of the mean is 

$$\sigma^2 = \frac{\int_{-\infty}^{+\infty} y\,x^2\,dx}{N}$$

where $y$ denotes the predicted frequency $\left(= \dfrac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\right)$ of the deviation $x$, and 
$N$ denotes the total number of deviations, that is, the area under the curve. 
Mean Deviation of the Tail of a Normal Distribution.—One of 
the many important properties of the normal curve involves the 
expression for the mean deviation of a truncated portion of a 
normal distribution. It will be of value, therefore, to develop a 
formula for this quantity. We shall deal with the normal 
distribution of unit area and unit standard deviation. 

Let $d$ represent the mean deviation, $q$ the proportion of cases 
in the distribution from the point of truncation $x_1$ on to $\infty$. The 
ordinate value of any point in this section of the area will, of 
course, be $z$. Then, since by the definition of a mean we must 
sum all deviations and divide by their number, our problem of 
finding $d$ will be that of evaluating the expression 

$$d = \frac{\int_{x_1}^{\infty} z\,x\,dx}{q}$$

When we replace the $z$ appearing under the integral sign by its 
equal given in (D), our expression for $d$ becomes 

$$d = \frac{\int_{x_1}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,x\,dx}{q}$$

In order to facilitate the integration involved in the above 
equation, let us insert a minus sign before the x appearing as one 
of the factors under the integral sign and compensate for this 
change by inserting another minus sign before the integral sign. 


Then 

$$d = \frac{-\int_{x_1}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,(-x)\,dx}{q}$$


Aside from the constant $1/\sqrt{2\pi}$, which may be taken outside 
the integral sign and which does not enter into the integration, 
our integral to be evaluated between the limits $x_1$ and $\infty$ is of the 
type form $\int e^{u}\,du$, in which form $u = -x^2/2$ and $du = -x\,dx$. 
Now we know that $\int e^{u}\,du = e^{u}$ (see page 35). Hence, 

$$d = \frac{-\left[\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\right]_{x_1}^{\infty}}{q}$$

Upon evaluating the quantity in the numerator between 
the designated limits, we find that for the upper limit ($\infty$) the 
quantity approaches zero,¹ and for the lower limit ($x_1$) the 
quantity obtained is $(1/\sqrt{2\pi})\,e^{-x_1^2/2}$, which is the value of $z$ (say $z_1$) 
at the point $x_1$. When the complex expression in the numerator 
above is replaced by its equal $z_1$, we obtain, therefore, 

$$d = \frac{z_1}{q} \tag{168}$$
(Mean deviation of the tail of a normal distribution)
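Formula (168) can be compared with a direct summation of deviations over the tail. In the sketch below the proportion $q$ is obtained from the complementary error function in Python's standard library, and the summation uses the midpoint rule with an upper limit of 12 standing in for infinity; these choices, the function names, and the truncation point $x_1 = 1$ are assumptions for illustration.

```python
from math import erfc, exp, pi, sqrt

def z(x):
    """Ordinate of the unit normal curve, Eq. (D)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def tail_proportion(x1):
    """Proportion q of cases lying beyond x1 in a unit normal distribution."""
    return 0.5 * erfc(x1 / sqrt(2))

def tail_mean_by_formula(x1):
    """Formula (168): d = z_1 / q."""
    return z(x1) / tail_proportion(x1)

def tail_mean_by_summation(x1, upper=12.0, steps=200000):
    """Sum x * z(x) over the tail in narrow strips, then divide by q."""
    width = (upper - x1) / steps
    total = 0.0
    for i in range(steps):
        mid = x1 + (i + 0.5) * width
        total += mid * z(mid) * width
    return total / tail_proportion(x1)

x1 = 1.0  # illustrative point of truncation
print(tail_mean_by_formula(x1), tail_mean_by_summation(x1))
```

Both routes give a mean deviation of about 1.525 for the tail beyond one sigma unit.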
Mean Deviation of a Portion of a Normal Distribution.—We 
shall now develop a formula for the mean deviation of a portion 
of a unit normal distribution included between two designated 
points of division. 

¹ The reader will observe that because of the negative sign appearing 
before the exponent of $e$ the whole factor will approach zero with increasingly 
large values of $x$, because the $e$ with a negative exponent in the numerator 
is the same as the $e$ with a positive exponent in the denominator of a fraction. 

Let $q_1$ be the proportion of cases lying beyond the point $x_1$, 
and let $q_2$ be the proportion of cases lying beyond the point $x_2$. 
Then the area (proportion of cases) between $x_1$ and $x_2$ is $q_1 - q_2$. 
Hence, from our definition of a mean we may write 

$$d = \frac{\int_{x_1}^{x_2} z\,x\,dx}{q_1 - q_2}$$

We may, of course, replace $z$ by its equal as given by (D) and 
rewrite the equation for $d$ as follows: 

$$d = \frac{\int_{x_1}^{x_2}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,x\,dx}{q_1 - q_2}$$

We have already integrated the numerator of the above 
expression (page 289). Making use of this value, we may write 

$$d = \frac{-\left[\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\right]_{x_1}^{x_2}}{q_1 - q_2}$$

Substituting the upper and lower limits, we now have 

$$d = \frac{\frac{1}{\sqrt{2\pi}}\,e^{-x_1^2/2} - \frac{1}{\sqrt{2\pi}}\,e^{-x_2^2/2}}{q_1 - q_2}$$

The individual terms of the numerator of this last equation are 
nothing more than $z_1$ and $z_2$, respectively. Hence 

$$d = \frac{z_1 - z_2}{q_1 - q_2}$$

or, if we call the proportion of cases lying in the sector between 
$x_1$ and $x_2$, $q$, then 

$$d = \frac{z_1 - z_2}{q} \tag{168a}$$
(Mean deviation of a portion of a normal distribution of unit area and unit standard deviation)

where $z_1$ is always the left-hand ordinate bounding the portion 
of the distribution and $z_2$ is the right-hand ordinate. 
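A brief sketch of formula (168a) follows. The proportions beyond each point are again taken from the standard library's complementary error function, which is an assumption of convenience; the function names and the example limits are illustrative only.

```python
from math import erfc, exp, pi, sqrt

def z(x):
    """Ordinate of the unit normal curve, Eq. (D)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def proportion_beyond(x):
    """Proportion of cases lying beyond (to the right of) x."""
    return 0.5 * erfc(x / sqrt(2))

def portion_mean_deviation(x1, x2):
    """Formula (168a): d = (z_1 - z_2) / q, with q = q_1 - q_2 and x1 < x2."""
    q = proportion_beyond(x1) - proportion_beyond(x2)
    return (z(x1) - z(x2)) / q

print(portion_mean_deviation(0.0, 1.0))   # portion between the mean and +1 sigma
print(portion_mean_deviation(-1.0, 1.0))  # symmetric portion about the mean
```

For a portion placed symmetrically about the mean the two ordinates are equal, so the mean deviation is zero, as one would expect.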
THE VALUES OF $\beta_1$ AND $\beta_2$ FOR A NORMAL DISTRIBUTION 


Throughout many of our developments we have had occasion 
to say that $\beta_1 = 0$ and $\beta_2 = 3$ in a normal distribution. We 
shall now give the proof for these values of the $\beta$’s. 

We have the following definition for $\beta_1$: 

$$\beta_1 = \frac{(\Sigma x^3/N)^2}{\sigma^6}$$

In the summation of the $x^3$’s the frequency must, of course, be 
considered. Remembering that the frequency at any value 
of $x$ is the corresponding $y$, we may express the above for a normal 
distribution as follows: 

$$\beta_1 = \frac{\left[\frac{1}{N}\int_{-\infty}^{+\infty}\frac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^3\,dx\right]^2}{\sigma^6} = \frac{\left[\int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^3\,dx\right]^2}{\sigma^6}$$

We are now led to the evaluation of the integral appearing 
in the numerator of this fraction. For this purpose we shall 
resort to the familiar “parts” method. We have 

$$\int u\,dv = uv - \int v\,du$$

(see page 38), where $\int u\,dv$ represents the integral we are to 
evaluate. Let $dv = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x\,dx$, and let $u = x^2$. Then, 
multiplying both numerator and denominator by $\sigma$ and indicating 
the integration, 

$$v = \int dv = \int \frac{\sigma}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,\frac{x}{\sigma^2}\,dx$$

The integral on the right is only a slight modification of the one 
already found. It is 

$$v = -\frac{\sigma}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$$

The differential for our expression for $u$ is $du = 2x\,dx$. When 
we substitute the values of $u$, $v$, and $du$ into the parts formula, 
$\int u\,dv = uv - \int v\,du$, the integral in question becomes, after 
due consideration of sign changes and noticing that $u\,dv$ is the 
expression in the $\beta_1$ formula above for which we are integrating, 

$$\int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^3\,dx = \left[-\frac{\sigma x^2}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2} + \int\frac{2\sigma}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x\,dx\right]_{-\infty}^{+\infty}$$


We must now integrate the second term of the expression 
within the brackets. In order to do this it is convenient to 
make certain algebraic adjustments so that we may apply the 
same standard integral form we have just used. If we divide 
the $x$ factor appearing under the integral by $-\sigma^2$, and compensate 
by multiplying the constant outside by $-\sigma^2$, our integral 
becomes at once of the form $\int e^{u}\,du$. If the reader will perform 
the task, he will find that the second term of the brackets becomes 

$$-\frac{2\sigma^3}{\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$$

Now the value of the quantity at the left of the equation will 
be that value obtained by evaluating the two right-hand terms 
between the limits $-\infty$ and $+\infty$. Since the first of these terms 
contains an even power of $x$, it will have the same value at $+\infty$ 
and $-\infty$, so that the value at the upper limit minus the value 
at the lower limit will be zero. Since the exponent of the $e$ in 
the second term has the negative sign which would have the 
effect of putting it in the denominator with a positive exponent 
and leaving in the numerator the constant $2\sigma^3/\sqrt{2\pi}$, the entire 
term will also approach zero for increasing values of $x$ whether 
positive or negative. Therefore the whole numerator of the 
fraction to which $\beta_1$ is equated (called the third moment or $\mu_3$) 
is equal to zero, since it equals the two members on the right of 
the equation which sum to zero. We would, therefore, have 

$$\beta_1 = \frac{(0)^2}{\sigma^6} = 0 \tag{169}$$
($\beta_1$ for a normal distribution)

In this development it has been incidentally shown that 
$\Sigma x^3/N$, called $\mu_3$, equals zero in a normal distribution. It can 
be readily shown that every other odd moment of a normal 
distribution also equals zero. 

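The following relation defines $\beta_2$: 

$$\beta_2 = \frac{\Sigma x^4/N}{\sigma^4}$$

Expressed in terms of the integral, this becomes, for a normal 
distribution, 

$$\beta_2 = \frac{\frac{1}{N}\int_{-\infty}^{+\infty}\frac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^4\,dx}{\sigma^4}$$

Canceling the $N$’s and clearing of fractions, 

$$\sigma^4\beta_2 = \int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^4\,dx$$

Apply the method of parts, letting $u = x^3$ and 
$dv = e^{-x^2/2\sigma^2}\,x\,dx$. 
Then 

$$v = \int dv = -\sigma^2\int e^{-x^2/2\sigma^2}\left(-\frac{x}{\sigma^2}\right)dx = -\sigma^2 e^{-x^2/2\sigma^2}$$

Furthermore, since $u = x^3$, $du = 3x^2\,dx$. Upon substituting in 
the parts formula, $\int u\,dv = uv - \int v\,du$, and at the same time 
remembering that the constant factor $\dfrac{1}{\sigma\sqrt{2\pi}}$ is to be carried along 
throughout the entire process, we have 

$$\sigma^4\beta_2 = \left[-\frac{\sigma^2 x^3}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\right]_{-\infty}^{+\infty} + 3\sigma^2\int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^2\,dx$$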

We observe that the first term within brackets is of the order 
$x^3/e^{x^2}$. When we let $x$ approach the infinite limits, this term 
takes on the indeterminate form $\infty/\infty$. We must, therefore, 
resort to a method of evaluating indeterminate forms provided 
by the calculus, but which we did not discuss in our calculus 
chapter. This method consists in differentiating repeatedly 
the numerator for a new numerator and the denominator for a 
new denominator until finally a meaningful value is found, when 
substitution of limits is made into the last of the series of expressions 
thus obtained. The reader may easily verify that after 
differentiating three times in this manner we obtain the expression 

$$\frac{6}{12x\,e^{x^2} + 8x^3 e^{x^2}}$$

Since this expression contains a constant in the numerator, its 
value will be zero when $x$ equals $+\infty$ or $-\infty$. Hence the first 
term becomes zero when evaluated between the limits $-\infty$ 
and $+\infty$. Let us now examine the second term. It may be 
written 

$$3\sigma^2\int_{-\infty}^{+\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\,x^2\,dx$$

The value of the integral is $\sigma^2$, so that the whole expression 
is equal to $3\sigma^4$. Substituting this in our equation above for 
$\beta_2\sigma^4$ and dividing through the equation by $\sigma^4$, we have 

$$\beta_2\sigma^4 = 3\sigma^4$$

$$\beta_2 = 3 \tag{170}$$
($\beta_2$ for a normal distribution)


Since the numerator of our $\beta_2$ equation was $\mu_4$, it has followed 
from our development that, in a normal distribution, $\mu_4 = 3\sigma^4$. 
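The values $\beta_1 = 0$ and $\beta_2 = 3$ can also be confirmed by summing the moments numerically. The sketch below uses a midpoint-rule summation over the range $\pm 12\sigma$ as a stand-in for the infinite limits; the value $\sigma = 1.5$ and the function names are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def normal_density(x, sigma):
    """Normal curve of unit area, formula (167)."""
    return exp(-x * x / (2 * sigma * sigma)) / (sigma * sqrt(2 * pi))

def moment_about_mean(order, sigma, half_width=12.0, steps=200000):
    """Midpoint-rule value of the integral of x^order * y dx over +/- half_width sigmas."""
    lower = -half_width * sigma
    width = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += x ** order * normal_density(x, sigma) * width
    return total

sigma = 1.5  # illustrative standard deviation
mu2 = moment_about_mean(2, sigma)   # should be sigma^2
mu3 = moment_about_mean(3, sigma)   # should be 0
mu4 = moment_about_mean(4, sigma)   # should be 3 sigma^4

beta1 = mu3 ** 2 / mu2 ** 3         # (sum x^3 / N)^2 / sigma^6
beta2 = mu4 / mu2 ** 2              # (sum x^4 / N) / sigma^4
print(mu2, mu3, mu4, beta1, beta2)
```

The result does not depend on the particular $\sigma$ chosen, since both $\beta$'s are ratios in which the unit of measurement cancels.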


POINTS OF INFLECTION ON THE NORMAL CURVE 


In the calculus it was shown that a condition for a point of 
inflection upon a curve is that the second derivative be equal to 
zero. If, then, we set the expression for the second derivative 
equal to zero, we are in a position to solve the resulting equation 
for the values of the independent variable which satisfy that 
condition. 

[Fig. 20.]

Now let us consider the equation of the normal curve and 
investigate the values of $x$ which give us the points of inflection. 
In the accompanying diagram, we wish to find the distance $a$. 
According to the argument given in the calculus, we must 
differentiate the equation of the normal curve twice and set the 
result equal to zero. Upon solving this final equation for $x$, we 
shall find those values which give the position of the point of 
inflection. The computation is as follows: 


The equation $y = \dfrac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$ is of the form $y = Ae^{v}$, where 
$A$ is a constant coefficient and $v$ is a variable exponent. Since 
the constant multiplier does not enter into the differentiation 
and since the derivative of the form $e^{v}$ is $e^{v}(dv/dx)$ (see page 24), 
we may write our first differentiation $dy/dx = Ae^{v}(dv/dx)$. 
The right-hand side of the expression for the first derivative is 
seen to be a product of the two functions $e^{v}$ and $dv/dx$ with the 
constant $A$ again appearing as a multiplier. Remembering that 
the derivative of a product is equal to the first times the derivative 
of the second plus the second times the derivative of the 
first, the reader will readily observe that the expression for the 
second derivative becomes 

$$\frac{d^2y}{dx^2} = A\left[e^{v}\,\frac{d^2v}{dx^2} + e^{v}\left(\frac{dv}{dx}\right)^2\right]$$
We must now set the expression for the second derivative equal 
to zero and solve for $x$ in order to find the points of inflection. 
Before doing so, however, we must replace the derivatives 
appearing in the right-hand member by their equals as determined 
from the replacement of the letter $v$ for the more complex 
exponent appearing in the normal equation. We have that 

$$v = -\frac{x^2}{2\sigma^2}$$

$$\frac{dv}{dx} = -\frac{x}{\sigma^2} \quad \text{(see page 7, differentiation of } x^n\text{)}$$

$$\frac{d^2v}{dx^2} = -\frac{1}{\sigma^2}$$

Making these substitutions into the expression for $d^2y/dx^2$ and 
equating the expression to zero, 

$$A\left[e^{-x^2/2\sigma^2}\left(-\frac{1}{\sigma^2}\right) + e^{-x^2/2\sigma^2}\left(\frac{x^2}{\sigma^4}\right)\right] = 0$$


The constant $A$ may, of course, be divided out. Since the factor 
$e^{-x^2/2\sigma^2}$ cannot equal zero for any finite value of $x$, it may likewise 
be divided out. Thus we are left with the following equation 
from which to determine the value of $x$ that will be a point of 
inflection: 

$$-\frac{1}{\sigma^2} + \frac{x^2}{\sigma^4} = 0$$

Dividing both sides by $1/\sigma^2$ and transposing, 

$$\frac{x^2}{\sigma^2} = 1$$

Hence $x^2 = \sigma^2$, and $x = \pm\sigma$. 

We have, therefore, discovered that the points of inflection 
of the normal curve lie at a distance of one $\sigma$ to the right and to the 
left of the mean. 
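The location of the points of inflection can be checked without any calculus by estimating the second derivative from three neighboring ordinates. This finite-difference estimate is a substitute for the differentiation carried out in the text; the step size, the value $\sigma = 2$, and the names below are assumptions for illustration.

```python
from math import exp, pi, sqrt

def normal_curve(x, sigma, n_total=1.0):
    """Normal curve of area n_total, Eq. (167a)."""
    return n_total * exp(-x * x / (2 * sigma * sigma)) / (sigma * sqrt(2 * pi))

def second_derivative(x, sigma, h=1e-4):
    """Central finite-difference estimate of d^2y/dx^2 at x."""
    return (
        normal_curve(x + h, sigma)
        - 2 * normal_curve(x, sigma)
        + normal_curve(x - h, sigma)
    ) / (h * h)

sigma = 2.0  # illustrative standard deviation
inside = second_derivative(0.9 * sigma, sigma)   # negative: curve bends downward
at_sigma = second_derivative(sigma, sigma)       # approximately zero
outside = second_derivative(1.1 * sigma, sigma)  # positive: curve bends upward
print(inside, at_sigma, outside)
```

The change of sign of the second derivative as $x$ passes through $\sigma$ is what marks that point as a point of inflection.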

Mean and Standard Deviation of the Point Binomial.—In our 
development of the formula for the normal curve, we began with 
the binomial expansion, the successive terms of which represent 
the occurrence of 0, 1, 2, etc., successes. We saw that for 
increasingly large numbers of events (the $n$ of the exponent of 
the binomial), the distribution of successes approached nearer 
and nearer the normal form when $p$ and $q$ are equal. Often, 
however, we are in a position to deal with distributions of which 
we know the $p$ and the $q$ and the $n$ as determined by sampling. 
It is convenient, therefore, to have a formula for the standard 
deviation in terms of these quantities. 

Let us begin by making a table of our assumed scores. Since 
each term of the expansion represents the probability of the 
occurrence of a particular score, the frequency of each score will 
be the product of the probability of its occurrence and $N$, the 
total number of scores. With this in mind, we may display our 
table as follows: 


From this table we obtain the mean and standard deviation 
of the distribution expressed by the point binomial by calculating 
the quantities given in the definitions of these measures. We 
have, upon adding the elements of column ($fx$) and dividing by $N$, 

$$M = \frac{\Sigma fx}{N} = \frac{1}{N}\left[Nnq^{n-1}p + Nn(n-1)q^{n-2}p^2 + N\,\frac{n(n-1)(n-2)}{2!}\,q^{n-3}p^3 + \cdots + Nnp^{n}\right]$$

We may factor out of each term in the brackets the quantity 
$Nnp$. Doing this, we may write 

$$M = \frac{\Sigma fx}{N} = \frac{Nnp}{N}\left[q^{n-1} + (n-1)q^{n-2}p + \frac{(n-1)(n-2)}{2!}\,q^{n-3}p^2 + \cdots + p^{n-1}\right]$$

The expansion now left within the brackets is that of the binomial 
$(q + p)^{n-1}$ and therefore equals 1, for $q + p = 1$. Hence, 
we have, after canceling the $N$ appearing in both numerator 
and denominator of the fraction, that 

$$M = np \tag{171}$$
(Mean of the point binomial)


From the definition of standard deviation 

$$\sigma^2 = \frac{\Sigma fx^2}{N} - \left(\frac{\Sigma fx}{N}\right)^2$$

The second term in this expression becomes, in view of the 
development just above, $n^2p^2$. Our problem now is to find 
$\Sigma fx^2/N$. Adding column ($fx^2$), and dividing by $N$, we write 

$$\frac{\Sigma fx^2}{N} = \frac{1}{N}\left[Nnq^{n-1}p + N\,2n(n-1)q^{n-2}p^2 + N\,\frac{3n(n-1)(n-2)}{2!}\,q^{n-3}p^3 + \cdots + Nn^2p^{n}\right]$$

Factor out of each term within brackets the quantity $Nnp$. 

$$\frac{\Sigma fx^2}{N} = \frac{Nnp}{N}\left[q^{n-1} + 2(n-1)q^{n-2}p + \frac{3(n-1)(n-2)}{2!}\,q^{n-3}p^2 + \cdots + np^{n-1}\right]$$

The expression now appearing within the brackets may be 
written as the sum of two series by properly grouping certain 
terms and portions of terms. That is, we may write 

$$\left[q^{n-1} + 2(n-1)q^{n-2}p + \frac{3(n-1)(n-2)}{2!}\,q^{n-3}p^2 + \cdots + np^{n-1}\right] = \left\{q^{n-1} + (n-1)q^{n-2}p + \frac{(n-1)(n-2)}{2!}\,q^{n-3}p^2 + \cdots + p^{n-1}\right\} + \left\{(n-1)q^{n-2}p + \frac{2(n-1)(n-2)}{2!}\,q^{n-3}p^2 + \cdots + (n-1)p^{n-1}\right\}$$

The expression within the first set of braces is the expansion 
of the binomial $(q + p)^{n-1}$ and therefore equals 1. We may 
factor $(n - 1)p$ out of the second set of braces and write this 
expression as $(n - 1)p\{q^{n-2} + (n - 2)q^{n-3}p + \cdots + p^{n-2}\}$. 
The expansion appearing within this final set of braces is that 
of the binomial $(q + p)^{n-2}$ and therefore equals 1. Thus we 
find the value of the expression given in brackets above is 
$1 + (n - 1)p$. Hence, 

$$\frac{\Sigma fx^2}{N} = \frac{Nnp}{N}\left[1 + (n - 1)p\right]$$

$$\frac{\Sigma fx^2}{N} = np(1 + np - p)$$

$$\frac{\Sigma fx^2}{N} = np + n^2p^2 - np^2$$

We are now in a position to substitute our values of $\Sigma fx^2/N$ 
and $(\Sigma fx/N)^2$ into the formula for $\sigma^2$. Making these substitutions, 
we obtain 

$$\sigma^2 = np + n^2p^2 - np^2 - n^2p^2$$

Collecting, we have $\sigma^2 = np - np^2$. Factor the right-hand 
member, $\sigma^2 = np(1 - p)$. But since $q + p = 1$, we have 
that $1 - p = q$. Therefore, $\sigma^2 = npq$. And 

$$\sigma = \sqrt{npq} \tag{172}$$
(Standard deviation of the point binomial)
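Formulas (171) and (172) can be verified for any particular case by building the terms of the binomial and computing the mean and standard deviation from their definitions. The sketch below takes $N = 1$, so that frequencies are probabilities, and uses $n = 20$ and $p = 0.3$; these values and the function name are illustrative assumptions.

```python
from math import comb, sqrt

def point_binomial_moments(n: int, p: float):
    """Mean and variance computed term by term from the expansion of (q + p)^n."""
    q = 1 - p
    # f[x] is the probability of exactly x successes.
    f = [comb(n, x) * q ** (n - x) * p ** x for x in range(n + 1)]
    mean = sum(x * fx for x, fx in enumerate(f))             # sum of fx
    mean_square = sum(x * x * fx for x, fx in enumerate(f))  # sum of fx^2
    variance = mean_square - mean ** 2
    return mean, variance

n, p = 20, 0.3  # illustrative values
mean, variance = point_binomial_moments(n, p)
print(mean, n * p)                                 # both give the mean np
print(sqrt(variance), sqrt(n * p * (1 - p)))       # both give sqrt(npq)
```

Note that the check does not require $p$ and $q$ to be equal; the formulas hold for any point binomial.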


Shape, Symmetry, Extent, and Slope of the Normal Curve.— 
We have already pointed out in our previous discussion that the 
normal curve is a symmetric bell-shaped curve whose slope at 
equal distances to the right and to the left of the mean is theoretically 
always the same. Furthermore, the curve changes 
from convex to concave at a distance of $\sigma$ on each side of the 
mean. 

An additional property of the normal curve is its approach 
to the x axis as we go out farther and farther in either direc- 
tion. This asymptotic approach may readily be demonstrated 
mathematically by considering the form of the equation and 
observing that because of the negative exponent of the term $e$, 
the equation may be written $y = (N/\sigma\sqrt{2\pi})(1/e^{x^2/2\sigma^2})$. Now, 
as we take increasingly large values of $x$ in absolute value (either 
positive or negative), the denominator of the fraction on the 
right involving $x^2$ becomes larger and larger, and as a consequence 
the right-hand member becomes smaller and smaller. This 
amounts to saying that $y$ approaches zero as $x$ increases without 
bound. Hence we see that, although for most practical purposes 
the normal curve is taken to include all the cases between $3.5\sigma$ 
or $4.0\sigma$ in either direction from the mean, this is not the actual 
situation if a sufficiently large sample of the total normal population 
were available. 

The Principle of Least Squares.—In the development of many 
statistical formulas we have found it necessary at times to 
minimize the sum of the squares of errors resulting from the use 
of observed rather than the actual scores. Because this proce- 
dure plays such an important part in so many developments and 
because its application is so universal, the advanced student 
of statistics will wish to examine the principle upon which it 
rests. 

Let us assume that our errors make a normal distribution. 


Then the frequency of the occurrence of a particular error would 
be given by the normal equation $y = y_0\,e^{-x^2/2\sigma^2}$. The probability 
of the occurrence of a particular error would be $y/N$. Hence, 
the probabilities representing the occurrence of the errors 
$x_1, x_2, x_3, \ldots, x_n$ would be given by the quantities 

$$\frac{y_0\,e^{-x_1^2/2\sigma^2}}{N},\quad \frac{y_0\,e^{-x_2^2/2\sigma^2}}{N},\quad \frac{y_0\,e^{-x_3^2/2\sigma^2}}{N},\quad \ldots,\quad \frac{y_0\,e^{-x_n^2/2\sigma^2}}{N}$$
Now it is a fundamental theorem in the study of probability 
that the probability of the simultaneous occurrence of several 
independent events is equal to the product of the probabilities 
of the several events, respectively. Hence in a normal distribution 
the probability that the errors $x_1, x_2, x_3, \ldots, x_n$ will 
occur at the same time would be given by the product of the 
above quantities which represent the individual probabilities. 
If we denote the measure of this product probability by $P$, we 
shall have 

$$P = \frac{y_0\,e^{-x_1^2/2\sigma^2}}{N}\cdot\frac{y_0\,e^{-x_2^2/2\sigma^2}}{N}\cdot\frac{y_0\,e^{-x_3^2/2\sigma^2}}{N}\cdots\frac{y_0\,e^{-x_n^2/2\sigma^2}}{N} \tag{F}$$

We may, of course, multiply by adding the exponents and write 

$$P = \frac{y_0^{\,n}\,e^{-(x_1^2 + x_2^2 + x_3^2 + \cdots + x_n^2)/2\sigma^2}}{N^{n}} \tag{G}$$

Placing the exponential factor in the denominator and at the 
same time changing the sign from minus to plus because of the 
change, 

$$P = \frac{y_0^{\,n}}{N^{n}\,e^{(x_1^2 + x_2^2 + x_3^2 + \cdots + x_n^2)/2\sigma^2}} \tag{H}$$
Equation (H) shows that the value of $P$ is a fraction whose 
value depends upon the sum of the squares of the errors. Since 
the value of the fraction is greatest when the denominator is 
least, the value of P will be greatest when the quantity within 
parentheses is least. That is, we must minimize the sum of 
the squares of the errors in order to obtain a maximum probability 
of the concurrence of these errors. From this standpoint, 
therefore, we have a partial explanation of the principle under- 
lying the least-squares method used in many of our statistical 
developments. 
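The connection between Eq. (H) and least squares can be illustrated with a small search. In the sketch below a handful of made-up observations are compared against a range of candidate "true" values; for each candidate the errors are the differences between observations and candidate, and the product measure $P$ is computed with $N = 1$. The data, the value of $\sigma$, the grid of candidates, and the function names are all hypothetical.

```python
from math import exp, pi, sqrt

def joint_probability_measure(errors, sigma):
    """Product of the normal ordinates of independent errors, Eq. (G) with N = 1."""
    y0 = 1 / (sigma * sqrt(2 * pi))
    sum_of_squares = sum(e * e for e in errors)
    return y0 ** len(errors) * exp(-sum_of_squares / (2 * sigma ** 2))

observations = [9.8, 10.1, 10.4, 9.9, 10.3]  # hypothetical observed scores
sigma = 0.5                                   # hypothetical error standard deviation

def errors_about(value):
    """Errors of the observations about a candidate true value."""
    return [obs - value for obs in observations]

candidates = [9.5 + 0.01 * k for k in range(101)]  # 9.50, 9.51, ..., 10.50

best_by_probability = max(
    candidates, key=lambda c: joint_probability_measure(errors_about(c), sigma)
)
best_by_squares = min(
    candidates, key=lambda c: sum(e * e for e in errors_about(c))
)
print(best_by_probability, best_by_squares)
```

The candidate that makes $P$ greatest is the same one that makes the sum of squared errors least, and here it coincides with the arithmetic mean of the observations.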

Normal Probability Tables—Their Construction and Uses.— 
Many interesting properties may be pointed out through an 
analysis of the construction and uses of probability tables. 
These tables usually give ordinate values which represent the 
probabilities of the occurrence of corresponding deviate values 
and integral values (areas) which represent the probabilities 
of the occurrence of a deviate within a given range, i.e., between 
any two particular ordinates. When the entire area under the 
curve is taken as 1, the area between any two designated ordinates 
is, therefore, the proportion of the total population falling within 
that range. Some tables give additional integral values which 
show the proportions of the distribution to the left and to the 
right of a particular ordinate. 


We have seen that the equation of the normal curve of unit 
area (see page 286) is $y = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$, where $x$ represents the 
deviation of a score from the mean of the distribution, $\sigma$ the 
standard deviation, and $y$ the ordinate which gives the probability 
of the occurrence of the deviate $x$. We have designated the 
quantity $(1/\sqrt{2\pi})\,e^{-x^2/2\sigma^2}$ by the letter $z$, and written 

$$y = \frac{1}{\sigma}\,z \quad \text{[Eq. (C), page 286]}$$


The values of $z$ which correspond to different values of $x/\sigma$ 
may be computed by direct substitution into the designated 
quantity above. For example, the value of $z$ for the sigma 
unit $x/\sigma = 2$ could be found by simply evaluating the expression 
$(1/\sqrt{2\pi})\,e^{-(2)^2/2}$. The value of this expression, as can be verified 
by simple arithmetic processes, turns out to be 0.05399. If the 
reader will turn to Table XLIV, he will find this value of $z$ appearing 
opposite the number 2.00 which appears in the $x/\sigma$ column. 

At this stage it may be well to point out that the $z$ values which 
appear in the tables are to be taken as ordinate values only when 
the $\sigma$ of the given distribution is 1. Otherwise, we must divide 
by the $\sigma$ of the distribution under consideration in order to 
obtain the correct ordinate for a particular sigma unit value. 

This fact is at once apparent from the form of Eq. (C). Hence, 
if we desired the probability of the occurrence of the sigma unit 
2.00 in a distribution whose standard deviation is 5.00, we should 
have for the corresponding ordinate value 

$$y = \frac{1}{\sigma}\,z = \frac{1}{5.00}(.05399) = .01080$$

This tells us that we should expect the occurrence of the sigma 
unit $x/\sigma = 2.00$ approximately once in every hundred random 
selections from our normal population. 
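The two figures quoted above are easily reproduced. The sketch below evaluates $z$ at the sigma unit 2.00 and then divides by $\sigma = 5.00$ as Eq. (C) requires; the function name is an illustrative assumption.

```python
from math import exp, pi, sqrt

def z_ordinate(t: float) -> float:
    """Tabled ordinate z for the sigma unit t = x / sigma."""
    return exp(-t * t / 2) / sqrt(2 * pi)

z_two = z_ordinate(2.00)  # ordinate for unit standard deviation
y = z_two / 5.00          # ordinate when the standard deviation is 5.00, Eq. (C)

print(round(z_two, 5))    # 0.05399
print(round(y, 5))        # 0.0108
```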

In order to obtain the area under the normal curve between the 
mean ordinate ‘and any other chosen ordinate, it is necessary to 
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integrate the normal equation between the values of v/s which 
correspond to these particular ordinates. Analytically this 
means that, if we let A represent the required area, our problem 
is to integrate the following expression: 


= fecal, ae 
F A= e dx 
@ 0 0V 2% 
where the upper limit refers to a definite sigma unit value. 
In order to simplify matters, it is convenient to make the trans- 
formation t = a/o in Eq. (J). Our integral then becomes! 


DAAT iin 
7 Ap 2 dt 
W) kyz 


There is no general formula for the value of this integral. For 
this reason, mathematicians have turned to convergent series in 
order to approximate its value for different values of t. The 
process consists in the termwise integration of the series obtained 
by expanding the function e^(−t²/2) and of then calculating successive 
approximations of A for successive substitutions of t values. We 
shall not take the space to justify the validity of the process in 
this development, and we shall omit the detailed development 
of the series. The following convergent series may readily be 
derived and may be employed to compute areas under the normal 
curve between the mean ordinate (y axis) and any other ordinate. 

$$A = \frac{1}{\sqrt{2\pi}}\left[t - \frac{1}{3}\left(\frac{t^3}{2}\right) + \frac{1}{5\cdot 2!}\left(\frac{t^5}{2^2}\right) - \frac{1}{7\cdot 3!}\left(\frac{t^7}{2^3}\right) + \cdots\right] \qquad (K)$$
To calculate the area A from the mean up to the sigma unit 
t = 2.00, for example, we need only to substitute this value of t 
into the right-hand member of (K), using as many terms as are 
necessary to give the degree of accuracy desired. The student 
may easily verify that the area for this value of t turns out to be 
0.47725, approximately. This means that about 48 per cent 
of the total normal population falls within the range between the 
mean and two sigma units. 

¹ In making the change, we see that since x/σ = t, x = σt, and thus 
dx = σ dt. This accounts for the disappearance of the σ in the 
denominator of the factor 1/(σ√(2π)). 
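The verification left to the student can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `area_from_mean` and the choice of 40 terms are assumptions, and the general term used is the one obtained by integrating the expansion of e^(−t²/2) term by term.

```python
import math

def area_from_mean(t, terms=40):
    """Area under the unit normal curve between the mean and t, by series (K)."""
    total = 0.0
    for n in range(terms):
        # n-th term: (-1)^n * t^(2n+1) / ((2n+1) * 2^n * n!)
        total += (-1) ** n * t ** (2 * n + 1) / ((2 * n + 1) * 2 ** n * math.factorial(n))
    return total / math.sqrt(2 * math.pi)

area = area_from_mean(2.0)
# Cross-check against the closed form expressed through the error function.
check = 0.5 * math.erf(2.0 / math.sqrt(2))
print(round(area, 5), round(check, 5))
```

Both figures should agree with the 0.47725 quoted in the text; for larger t more terms are needed before the alternating series settles.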

Because of the symmetry of the normal curve, the y axis 
divides the entire area into two equal parts. Therefore, when 
the total area is taken as 1, the proportion to the left of any 
ordinate on the right of the y axis may be obtained by adding 
0.50000 to the value of A for that particular ordinate. If the 
ordinate in question lies to the left of the y axis, we must subtract 
the value of A from 0.50000 to obtain the left-hand portion of the 
area under the curve. On the other hand, if we desire the area 
to the right of an ordinate, we must subtract the value of A from 
0.50000 for ordinates to the right of the y axis and add the value 
of A to 0.50000 for ordinates to the left of the same axis. The 
area between −x/σ and +x/σ (i.e., between ordinates equally 
spaced to the left and to the right of the mean) is twice the value 
of A. In any case, the proportions cut off by any particular 
ordinate or ordinates can be computed directly from the value 
of A. Table XXIII displays areas and corresponding ordinates 
for a few values of the abscissa x/σ. 

TABLE XXIII. AREAS AND ORDINATES UNDER THE NORMAL CURVE IN 
TERMS OF ABSCISSAS 

(1) Abscissa x/σ 
(2) Area between mean ordinate and ordinate at x/σ 
(3) Area to left of ordinate 
(4) Area to right of ordinate 
(5) Area between ordinates at −x/σ and +x/σ 
(6) Sum of areas to left of −x/σ and to right of +x/σ 
(7) Ordinate at x/σ 

  (1)    |  (2)   |  (3)   |  (4)   |  (5)   |  (6)   |  (7) 
 0.0000  | 0.0000 | 0.5000 | 0.5000 | 0.0000 | 1.0000 | 0.3989 
 0.5000  | 0.1915 | 0.6915 | 0.3085 | 0.3829 | 0.6171 | 0.3521 
 0.6745  | 0.2500 | 0.7500 | 0.2500 | 0.5000 | 0.5000 | 0.3178 
 1.5000  | 0.4332 | 0.9332 | 0.0668 | 0.8664 | 0.1336 | 0.1295 
 2.0000  | 0.4772 | 0.9772 | 0.0228 | 0.9545 | 0.0455 | 0.0540 
 4.0000  | 0.5000 | 1.0000 | 0.0000 | 0.9999 | 0.0001 | 0.0001 
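The rules just stated for passing from A to the other areas can be put into a short Python sketch. The function name `areas` and the dictionary keys are illustrative assumptions; A itself is taken here from the standard library's `NormalDist` rather than from series (K).

```python
from statistics import NormalDist

def areas(x_over_sigma):
    """Areas cut off by the ordinate at x/sigma, total area taken as 1."""
    # A: area between the mean ordinate and the ordinate in question.
    a = NormalDist().cdf(abs(x_over_sigma)) - 0.5
    if x_over_sigma >= 0:
        left, right = 0.5 + a, 0.5 - a   # ordinate to the right of the y axis
    else:
        left, right = 0.5 - a, 0.5 + a   # ordinate to the left of the y axis
    return {
        "A": a,
        "left": left,
        "right": right,
        "between": 2 * a,        # between -x/sigma and +x/sigma
        "outside": 1 - 2 * a,    # the two tails together
    }

for t in (0.5, 0.6745, 1.5, 2.0):
    row = areas(t)
    print(t, {name: round(value, 4) for name, value in row.items()})
```

Rounded to four places, the printed rows should reproduce the corresponding area columns of a normal table for the same abscissas.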

All entries in the table have been computed on the basis of 
unit area. If, for example, N = 1,000, then 1,000 times the 
entries of areas give the total frequencies for those columns. 
Thus for the case x/σ = 2.00 in a distribution of 1,000 scores, the 
probable frequency between the mean ordinate and the ordinate 
at two sigma units from the mean would be 1,000 times 0.47725, 
or 477.25. This tells us that we should expect approximately 
477 cases out of 1,000 to fall within this range. 

The reader will observe that one-fourth of the total area 
is included between the mean ordinate and the ordinate at 
0.67449 sigma units from the mean; i.e., where x/σ = 0.67449. 
Multiplying both members by σ, x = 0.67449σ, the value of a 
deviation which marks off one-fourth of the area, that fourth 
which lies on either side of the mean. Thus we see that half the 
total area lies between deviations which are at a distance of 
0.67449σ from either side of the mean. We may conclude, 
therefore, that, if a deviation is chosen at random from a normal 
population, the chances are even that it will lie within this range. 
This range is commonly called the probable error. It is written 

P.E. = 0.67449σ. 
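The constant 0.67449 is simply the abscissa that leaves one-fourth of the area between itself and the mean. A minimal sketch, assuming the standard library's `NormalDist` as a substitute for the printed tables (the names `quartile_point` and `probable_error` are illustrative):

```python
from statistics import NormalDist

unit_normal = NormalDist()

# Abscissa with three-fourths of the area to its left, i.e. one-fourth
# of the area between the mean ordinate and this ordinate.
quartile_point = unit_normal.inv_cdf(0.75)

def probable_error(sigma):
    """Probable error of a normal distribution with standard deviation sigma."""
    return quartile_point * sigma

central_area = unit_normal.cdf(quartile_point) - unit_normal.cdf(-quartile_point)
print(round(quartile_point, 5), round(central_area, 4))
print(round(probable_error(50.9), 2))  # sigma of the sophomore scores used later in the chapter
```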


Graduation of Data to Normal Distribution.—The problem of 
graduating a given group of scores to a normal distribution 
properly belongs to the study of curve fitting. Nevertheless, 
it is well to consider the problem at this time in order to throw 
further light on the use and interpretation of normal probability 
tables. The task of adjusting a normal curve to a given 
distribution is one of passing a smooth curve through the upper 
extremities of theoretical ordinates (those taken from the tables) 
which are found to correspond to actual sigma values of the 
distribution at hand. It is usually advisable to make a list of 
those items which are necessary for purposes of computation 
before plotting the actual frequency polygon and superimposing 
the resulting theoretical curve. The graph itself gives us a 
visual impression of the goodness of fit; but we are very often 
led to a more statistical test.¹ 


The following data for observed frequencies are the scores 
obtained by 149 sophomores at Pennsylvania State College on 
the 1930 Carnegie Foundation Tests (Professional Education). 

¹ See p. 417 for the χ² test. 

TABLE XXIV. SOPHOMORE SCORES ON THE PROFESSIONAL EDUCATION 
SECTION OF THE CARNEGIE FOUNDATION TESTS, 1930 

Score intervals | Frequency 
319.5-339.5     |    3 
299.5-319.5     |    5 
279.5-299.5     |    8 
259.5-279.5     |   12 
239.5-259.5     |   19 
219.5-239.5     |   26 
199.5-219.5     |   22 
179.5-199.5     |   18 
159.5-179.5     |   14 
139.5-159.5     |    6 
119.5-139.5     |   13 
 99.5-119.5     |    3 

N = 149        M = 215.4        σ = 50.9 


Our problem is first of all to determine theoretical normal 
curve frequencies for the intervals appearing in the table above. 
To do this, we must find the areas under the normal curve which 
correspond to these intervals and then multiply by 149. We 
simply find the area from the mean up to the upper boundary 
of the interval, and subtract the area which lies between the 
mean and the lower boundary of the same interval. In this way 
we find the theoretical frequencies for all the intervals. 

Consider, for example, the interval (319.5-339.5). Since the 
mean is 215.4, the same interval in deviation form becomes 
(104.1-124.1). When we divide by the value of σ (50.9), the 
interval in sigma units becomes (2.04-2.43). From the normal 
probability tables (pages 485 to 487) we find that the area from 
the mean up to 2.43 sigma units is .493, and from the mean up to 
2.04 sigma units is .480. Subtracting, we find the area included 
in the interval to be .013. We conclude that we may expect 
approximately 13 cases out of 1,000 cases to fall within the score 
interval (319.5-339.5). Since in our case N = 149, the predicted 
frequency would be, therefore, .013 × 149 = 1.94, or 
approximately 2. The reader will observe that actually 3 cases 
fell within this group. 
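The same computation can be carried through every interval at once. The sketch below is an illustration under stated assumptions: it uses Python's `NormalDist` in place of the printed probability tables (so areas are not rounded to three places), and it takes the observed frequencies from the Actual frequency column of the graduation table, listed from the lowest interval upward.

```python
from statistics import NormalDist

N, mean, sigma = 149, 215.4, 50.9
boundaries = [99.5 + 20 * i for i in range(13)]          # 99.5, 119.5, ..., 339.5
observed = [3, 13, 6, 14, 18, 22, 26, 19, 12, 8, 5, 3]   # lowest interval first

curve = NormalDist(mean, sigma)
# Area in an interval = area up to its upper boundary minus area up to its lower one.
theoretical = [
    N * (curve.cdf(upper) - curve.cdf(lower))
    for lower, upper in zip(boundaries, boundaries[1:])
]

for lower, upper, expected, actual in zip(boundaries, boundaries[1:], theoretical, observed):
    print(f"{lower:6.1f}-{upper:6.1f}  theoretical {expected:6.2f}  actual {actual:3d}")
```

Subtracting cumulative areas in this way also covers the interval that contains the mean, where the table method adds the two areas lying on either side of the mean.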

We proceed in this manner to make a table of the graduation 
data for all the intervals. These data are displayed in 
Table XXV. 

TABLE XXV. NORMAL CURVE GRADUATION DATA FOR 149 SOPHOMORE 
SCORES ON THE CARNEGIE FOUNDATION TESTS, 1930 

Interval   | Deviation | Deviation  | Area from  | Portion | Theoretical | Actual    | Difference 
boundaries | from mean | in σ units | mean up to | of area | frequency   | frequency | 
           |           |            | σ unit     | between |             |           | 
339.5      | +124.1    | +2.43      | .493       |         |             |           | 
           |           |            |            | .013    |  2          |  3        | -1 
319.5      | +104.1    | +2.04      | .480       |         |             |           | 
           |           |            |            | .029    |  4          |  5        | -1 
299.5      | + 84.1    | +1.65      | .451       |         |             |           | 
           |           |            |            | .055    |  8          |  8        |  0 
279.5      | + 64.1    | +1.26      | .396       |         |             |           | 
           |           |            |            | .091    | 13          | 12        | +1 
259.5      | + 44.1    | +0.86      | .305       |         |             |           | 
           |           |            |            | .125    | 18          | 19        | -1 
239.5      | + 24.1    | +0.47      | .180       |         |             |           | 
           |           |            |            | .149    | 22          | 26        | -4 
219.5      | +  4.1    | +0.08      | .031       |         |             |           | 
215.4      |    0.0    |  0.00      | .000       | .155    | 23          | 22        | +1 
199.5      | - 15.9    | -0.31      | .124       |         |             |           | 
           |           |            |            | .134    | 20          | 18        | +2 
179.5      | - 35.9    | -0.70      | .258       |         |             |           | 
           |           |            |            | .106    | 16          | 14        | +2 
159.5      | - 55.9    | -1.10      | .364       |         |             |           | 
           |           |            |            | .068    | 10          |  6        | +4 
139.5      | - 75.9    | -1.49      | .432       |         |             |           | 
           |           |            |            | .038    |  6          | 13        | -7 
119.5      | - 95.9    | -1.88      | .470       |         |             |           | 
           |           |            |            | .018    |  3          |  3        |  0 
 99.5      | -115.9    | -2.27      | .488       |         |             |           | 


The area for the interval which included the mean (199.5-219.5) 
was obtained by adding the areas from the mean out to either 
extremity of the interval. The Difference column in the table 
may be used to make rapid adjustments in the frequency polygon 
and thus to give the points through which the smooth theoretical 
curve is to be drawn. 

Figure 21 shows the frequency polygon of the sophomore 
scores, together with the superimposed theoretical normal 
curve. 

The Ordinates Method.—The normal curve of Fig. 21 was 
drawn through the points that represented the theoretical 
frequencies in each interval. Another very simple method of 
plotting the curve is to erect ordinates at given distances along 
the x axis and to pass the curve through the upper extremities 
of these ordinates. Usually the practice is to start at the mean 
and erect an ordinate at each half sigma in each direction until 
five or more ordinates are found on either side of the mean. 

[Fig. 21. Normal-curve graduation of 149 sophomore scores on the 
Carnegie Tests, area method. Horizontal axis: score-interval 
boundaries from 99.5 to 339.5.] 

With the frequency polygon already drawn on a scale that is 
marked off along the x axis for scores and on the y axis for 
frequencies, it is easy to determine graphically where the ordinates 
should be erected. The method is simply that of graphing the 
equation of the normal curve, which we have seen (see page 286) 
may be written y = (N/σ)z. In our problem N = 149, and in 
terms of intervals σ = (50.9 ÷ 20) = 2.5, so that 
N ÷ σ = 149 ÷ 2.5 = 59.6. The equation whose curve we 
wish to plot may thus be written y = 59.6z. The value of z 
for each value taken along the x axis may be calculated by 
logarithms from the equation z = (1/√(2π)) e^(−x²/2σ²) or more easily 
from our z table.¹ The normal-curve ordinates at the mean, 
±0.50, ±1.00, ±1.50, ±2.00, ±2.50, and ±3.00 for our problem 
are as follows: 

x/σ      y = 59.6z 
 0       23.77 (mean ordinate) 
±0.5     20.98 
±1.0     14.42 
±1.5      7.71 
±2.0      3.21 
±2.5      1.04 
±3.0       .26 

¹ Appendix, Table XLIV. 
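The ordinates can be recomputed directly from y = (N/σ)z. In this sketch the function name `z` and the dictionary `ordinates` are illustrative, and σ in interval units is taken as the rounded 2.5 used in the text, so the results may differ from the printed column in the last decimal place.

```python
import math

N = 149
sigma_in_intervals = 2.5            # 50.9 / 20, rounded as in the text
scale = N / sigma_in_intervals      # the multiplier 59.6

def z(t):
    """Ordinate of the unit normal curve at t = x/sigma."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

ordinates = {t: scale * z(t) for t in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)}
for t, y in ordinates.items():
    print(f"x/sigma = +/-{t:.1f}   y = {y:5.2f}")
```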


The frequency histogram representing the 149 scores on the 
Carnegie Foundation Tests and the normal curve plotted by the 
ordinates method are shown in Fig. 22. 

[Fig. 22. Normal-curve graduation of 149 sophomore scores on the 
Carnegie Tests, ordinates method. Horizontal axis: score-interval 
boundaries from 99.5 to 339.5.] 

Goodness of Fit.—The use of chi square in testing goodness 
of fit of the normal curve is illustrated in Chap. XIV. Another 
test of normality developed recently involves the ratio of the 
mean deviation to the standard deviation.¹ Geary gives a table 
of average ratios to expect for different n's and also for the highest 
and lowest in 1 per cent and in 5 per cent of the samples. He 
also gives the standard error of these ratios, which he calls w_n. 
Concerning the use of this ratio as a test of normality Geary 
says: 

From this investigation it appears very likely that, for quite small 
samples drawn at random from a normal universe with mean zero, the 
distribution of w_n is fairly close to normal . . . The advantages of w_n, 
regarded as a function of the original variables x, are as follows: like β₂ it 
assumes a characteristic value for infinite normal random samples; the 
values of its semi-invariants indicate that its distribution is far closer to 
normal, even for moderate samples, than that of β₂; and . . . its 
frequency distribution can be determined for all normal samples. Its 
principal disadvantage is that it is not symmetrical in the original 
variables . . . It would seem advisable to randomize the sample a few 
times and to calculate w_n for each permutation. The mean or the 
median w_n might then be taken as the representative value for the 
purpose of determining the probability of normality. 

¹ Geary, R. C., "The Ratio of the Mean Deviation to the Standard 
Deviation as a Test of Normality," Biometrika, Vol. 27, pp. 310-332 (1935). 


For the distribution of w_n (A.D./σ) Geary employs the same 
technique as Student for t and Pearson for χ², viz., joint 
probability. He gives tables for the mean value to be expected, for 
σ_wn, and for the upper and lower 1 per cent and 5 per cent values. 
He holds that even for quite small samples from a normal 
distribution the distribution of w_n will not be far from normal. 
The probability points of w_n for the upper and lower 1 and 5 per 
cent levels, the mean of w_n, and the standard deviation of w_n are 
given in Table XXVI for different values of N − 1, labeled n. 

TABLE XXVI. THE 1 AND 5 PER CENT PROBABILITY POINTS OF w_n 

[Table body not recoverable from the source.] 

For the data of Table XXIV n = 148, σ = 50.9, and 
A.D. = 40.3, so that 
w_n = A.D./σ = 40.3 ÷ 50.9 = .7917. From Table XXVI we 
see that this value is within the 5 per cent level for an n between 
100 and 500 and is very close to the mean w_n to be expected. 
On the basis of the Geary test we would conclude, therefore, 
that the sophomore scores on the Carnegie Foundation Tests 
are distributed normally. This agrees with the χ² test of 
goodness of fit discussed on page 418. 
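The ratio itself is easy to compute for any set of raw scores. The sketch below is illustrative only: `geary_ratio` is a hypothetical helper that takes deviations from the sample mean and uses the standard deviation computed with divisor N, which is an assumption about the convention intended. For a normal population the ratio of mean deviation to standard deviation tends to √(2/π), about .798, which is why the .7917 found above is unremarkable.

```python
import math
from statistics import fmean, pstdev

def geary_ratio(values):
    """Mean deviation divided by standard deviation (Geary's w_n), from raw scores."""
    m = fmean(values)
    mean_deviation = fmean([abs(v - m) for v in values])
    return mean_deviation / pstdev(values)

# Value obtained in the text from the summary figures A.D. = 40.3 and sigma = 50.9.
w_text = 40.3 / 50.9

# Limiting value of the ratio for a normal population.
normal_limit = math.sqrt(2 / math.pi)

print(round(w_text, 4), round(normal_limit, 4))
print(round(geary_ratio([1, 2, 3, 4, 5]), 4))  # small illustrative sample, not from the text
```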

APPLICATIONS OF THE NORMAL CURVE CONCEPT 


The normal curve is put to very many uses in educational 
and sociological research, some of which are discussed and 
illustrated at length in the elementary texts. To ask what use 
can be made of such a concept as that of normality of 
distribution is much like asking what use can be made of a lathe. A 
lathe is a tool which we can employ in making all sorts of products, 
according to our needs in the particular exigency; and the same 
is true of statistical formulas, including the normal curve 
function. Nevertheless we shall list a few of the types of uses to 
which the normal curve function has been applied, describing 
them here very briefly and recommending further reading in the 
sources, or in the more elementary texts, for students who wish 
to pursue the matter further. 

1. To assign difficulty values to questions in a test. As 
questions become increasingly difficult a larger percentage 
of pupils fail them, the percentage increasing slowly at first, 
rapidly around the middle difficulties, and then slowly again at 
the upper extreme. Difficulty values are assigned in terms of the 
distance along the x axis from the mean or from some other zero 
point to the ordinate which divides the proportion that succeeded 
with the question from the proportion that failed it.¹ 

2. To assign difficulty values to different scores on a test. 
The technique is essentially the same as that involved in the 
paragraph above, except that the ordinate is located by the 
proportion made up of those who earned a lower score than 
the one in question plus half those who earned the same score.² 

3. To set standards for the distribution of grade marks. For 
this purpose the base line of a normal distribution is marked off 
into as many equal divisions as there are steps in the scale of 
grades and the area of each of these divisions is taken as norm 
for the percentage of individuals to receive the corresponding 
mark. Sometimes the base line is cut off at a range of five sigmas 
and sometimes a range of six sigmas. 

¹ Woody, Clifford, Measurement of Some Achievements in Arithmetic, 
Teachers College Bureau of Publications, Columbia University, 1916. 

² McCall, W. A., How to Measure in Education, The Macmillan 
Company, 1922, p. 278. 

4. To indicate the numbers of pupils to be expected in each 
division when desiring to divide pupils into ability groups of 
equal range of talent. The technique is the same as under 3. 

5. To transmute marks distributed according to different 
standards of leniency. Suppose a teacher gives 20 per cent of 
his students A, 40 per cent B, 30 per cent C, 8 per cent D, and 
2 per cent E. By use of formula (168a), a difficulty value for each 
of these grades and a comparison with the difficulty of 
corresponding grades by other teachers may be determined, and, if desired, 
all grades may be transmuted to the same standard. To make 
such computations is left as an exercise for the student. 

6. To make scales for measuring the merit of handwriting, 
drawing, English composition, etc. For this purpose specimens 
of these objects are ranked into overlapping distributions 
and scale values determined from the mean of one of these 
distributions to the mean of the next. Thurstone has recently 
extended and improved upon this technique in making attitude 
scales. His essential addition consists in taking the σ of one of 
the distributions as standard and stating all steps in terms of 
this standard, thus securing more consistent units than by the 
older method.¹ 

7. To make scales for measuring the mores of society, and to 
measure deviations from morality.² 


References for Further Reading 


Pearson, Karl: "Historical Note on the Origin of the Normal Curve of 
Errors," Biometrika, Vol. 16, pp. 402-404. 

Romanovsky, V.: "Notes on the Moments of a Binomial (p + q)^n about its 
Mean," Biometrika, Vol. 15, pp. 410-412. 


¹ J. Abn. and Soc. Psychol., Vol. 21, pp. 384-400; or The Amer. J. Sociol., 
Vol. 31, pp. 529-554. 

² Peters, C. C., Motion Pictures and Standards of Morality, The 
Macmillan Company, 1933, Chaps. II-V. For an extended account of the uses of 
the normal curve in psychological and educational research see J. P. 
Guilford, Psychometric Methods, McGraw-Hill Book Company, Inc., 1936, Chaps. 
IV-IX. For a shorter account see H. E. Garrett, Statistics for Students in 
Psychology and Education, Longmans, Green & Company, rev. ed., 1937, 
Chap. VI. 


CHAPTER XI 


THE CORRELATION RATIO 
CURVILINEAR CORRELATION 


When we treated standard error of estimate (pages 112 to 117), 
we found that σ_est.y = σ_y√(1 − r²_xy). This standard error of 
estimate we saw to be the standard deviation of one of the 
columns of the correlation table, on the assumption that all the 
columns have the same standard deviation. We may denote it 
σ_c as well as σ_est. Using this notation, squaring, and proceeding 
with several other algebraic transformations, 

$$\sigma_c^2 = \sigma_y^2(1 - r_{xy}^2) = \sigma_y^2 - \sigma_y^2 r_{xy}^2; \qquad \sigma_y^2 r_{xy}^2 = \sigma_y^2 - \sigma_c^2$$

$$r_{xy}^2 = \frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2} = 1 - \frac{\sigma_c^2}{\sigma_y^2}$$


Thus an r could be computed in terms of the standard 
deviation of a column and the standard deviation of the whole 
distribution. The r is, thus, determined by the extent of the scatter 
of the columns in comparison with the extent of the scatter in 
the whole distribution. However, σ_c² must be calculated from 
the regression line as origin, and this cannot be done in advance 
of a knowledge of the r itself. But we can get a convenient 
measure of relationship by giving up the demand that σ_c be computed 
from measures taken as deviations from the regression line and 
by letting the measures from which it is computed be deviations 
from the mean of the column. The value of our correlation will 
not be quite the same, so we must employ a new symbol 
for it. Writing σ_ay for the standard deviation of a column 
computed about the column mean, 

$$\eta_{yx} = \sqrt{1 - \frac{\sigma_{ay}^2}{\sigma_y^2}}$$ (Correlation ratio) (173) 


This coefficient, eta, may include the case of curvilinear 
correlation. It gives us a measure of the extent to which the y scores 
for each given x value are grouped compactly together and, 
consequently, indicates the degree to which some law is present 
in the relation between the x and the y factors, but the line of the 
means may become a nonrectilinear one. Hence we may not 
use η in the simple regression equation developed in connection 
with r, for that is an equation for a straight-line relation. The 
accompanying correlation tables, showing for pupils in two grades 
of the rural schools of Centre County, Pa., the relation of scores 
in the Otis Classification Test to pupils' ages, depict this sort of 
situation. It will be seen from the formula that η varies between 
0 and 1. For, if there is no scatter of the columns at all, so that 
all y scores for a given x value come at exactly the same point, 
showing complete determination of y values by x values, σ²_ay = 0, 
the value under the radical becomes 1, and its square root is 1. 
If there is no law operative, so that scores in each column scatter 
as widely as the whole distribution does, σ²_ay = σ²_y, and we have 
under the radical 1 − 1 = 0, so that η = 0. η can never be 
negative, since its only function is to show the degree of the 
presence of a law, and that degree can run only from none to 
complete. But η will always be exactly equal to r (in the case of 
complete rectilinearity) or greater than r, never less. This is 
because a standard deviation is always the least possible when its 
deviations are taken from the mean of its distribution, as they are 
in the case of η. Thus σ²_c taken from the regression line, which 
lies outside the mean of at least some of the columns except 
in the case of strictly rectilinear regression, is greater than σ²_ay. 
Hence less is subtracted from the 1 under the radical in the η 
formula, and, consequently, the η is greater than the 
corresponding r. 

In practice the formula for η is usually put into a different 
form from that given above. To get it into the conventional 
form, we shall square it and carry it through a simplifying process. 

$$\eta^2 = 1 - \frac{\sigma_{ay}^2}{\sigma_y^2} = \frac{\sigma_y^2 - \sigma_{ay}^2}{\sigma_y^2}$$


The mean of a column lies at a distance of d, we shall say, 
from the mean of the whole distribution. If our measures are 
in deviation form, this d will equal, of course, Σy_c/n_c, where n_c 
is the number of cases in the column in question. Then for any 
one column, taking our measures as deviations from the mean 
of the whole y distribution as said, 

$$\sigma_{ay}^2 = \frac{\sum y_c^2}{n_c} - \left(\frac{\sum y_c}{n_c}\right)^2$$

Summing for all the columns weighted for frequency and dividing 
by the sum of the frequencies, 

$$\frac{\sum n_c\,\sigma_{ay}^2}{N} = \frac{\sum\left(\sum y_c^2\right)}{N} - \frac{\sum n_c\left(\sum y_c/n_c\right)^2}{N}$$

If we make the assumption of homoscedasticity this becomes 

$$\frac{N\sigma_{ay}^2}{N} = \frac{\sum y^2}{N} - \frac{\sum n_c M_c^2}{N}, \qquad \sigma_{ay}^2 = \sigma_y^2 - \sigma_{m_y}^2$$

Making in the η² formula above the substitution of the value just 
shown, we have 

$$\eta_{yx}^2 = \frac{\sigma_y^2 - \left(\sigma_y^2 - \sigma_{m_y}^2\right)}{\sigma_y^2} = \frac{\sigma_{m_y}^2}{\sigma_y^2}, \qquad \eta_{yx} = \frac{\sigma_{m_y}}{\sigma_y}$$ (Second form of the correlation ratio) (174) 

Thus η equals the standard deviation of the means of the 
columns divided by the standard deviation of the entire 
distribution. The formula for the regression of x on y would obviously 
be η_xy = σ_mx/σ_x, where, of course, in the former case the columns 
for which the standard deviation of means is taken are columns 
of y values for particular values of x, while in the latter case they 
are columns of x values for particular values of y. 
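That the two forms of η agree can be checked numerically. The sketch below uses a small invented correlation table (the dictionary `columns` is hypothetical data, not taken from the text) and computes η once from the scatter within columns and once from the scatter of the column means; r is included for comparison.

```python
from statistics import fmean, pvariance

# Hypothetical y scores arranged by column (x value); the trend is deliberately curved.
columns = {0: [2, 3, 4], 1: [5, 6, 7, 8], 2: [6, 7, 8], 3: [3, 4, 5]}

all_y = [y for col in columns.values() for y in col]
xs = [x for x, col in columns.items() for _ in col]
N = len(all_y)
grand_mean = fmean(all_y)
var_y = pvariance(all_y)

# Weighted average of the column variances, each taken about its own column mean.
within = sum(len(col) * pvariance(col) for col in columns.values()) / N
# Variance of the column means about the grand mean, weighted by column frequency.
between = sum(len(col) * (fmean(col) - grand_mean) ** 2 for col in columns.values()) / N

eta_first_form = (1 - within / var_y) ** 0.5     # form (173)
eta_second_form = (between / var_y) ** 0.5       # form (174)

mean_x = fmean(xs)
covariance = sum((x - mean_x) * (y - grand_mean) for x, y in zip(xs, all_y)) / N
r = covariance / (pvariance(xs) * var_y) ** 0.5

print(round(eta_first_form, 4), round(eta_second_form, 4), round(r, 4))
```

With a curved trend like this one, η comes out well above the absolute value of r, as the argument in the text requires.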

The formula for η is frequently employed in just the form given. 
But we can put it into a more convenient shape and can pair it 
with a formula for r for the same data, by a little algebraic 
transformation. 

The mean of any column is Σy_c/n_c, where n_c is the frequency for 
that column. We wish to find the standard deviation of the set 
of means involved in all the columns. We shall employ the 
formula for σ² with zero as the assumed mean, which is 

$$\sigma^2 = \frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2$$

But instead of aggregating the moments as we go, we shall let 
them stand in the formula as they result from each of the separate 
columns. Our frequencies for the successive columns, 0, 1, 2, 
3, . . . , we shall represent by n₀, n₁, n₂, n₃, . . . . Our moments 
must, of course, be weighted by these frequencies. Our standard 
deviation squared for the means of the y columns will be, then, 

$$\sigma_{m_y}^2 = \frac{1}{N}\left[n_0\left(\frac{\sum y_0}{n_0}\right)^2 + n_1\left(\frac{\sum y_1}{n_1}\right)^2 + n_2\left(\frac{\sum y_2}{n_2}\right)^2 + \cdots\right] - \left(\frac{\sum y}{N}\right)^2$$

The N refers to the total population in contrast with the n's of 
the several columns which refer to the populations of their 
respective columns. We may now combine the n's with the 
quantities in parentheses and have 

$$\sigma_{m_y}^2 = \frac{1}{N}\left[\frac{(\sum y_0)^2}{n_0} + \frac{(\sum y_1)^2}{n_1} + \frac{(\sum y_2)^2}{n_2} + \frac{(\sum y_3)^2}{n_3} + \cdots\right] - \frac{(\sum y)^2}{N^2}$$

The sigma of the whole set of y's, which we need in squared 
form for our denominator, is given by the familiar formula 

$$\sigma_y^2 = \frac{\sum y^2}{N} - \frac{(\sum y)^2}{N^2}$$

Substituting this value for σ²_y and then multiplying both 
numerator and denominator by N², we have, for η², 

$$\eta_{yx}^2 = \frac{N\left[\dfrac{(\sum y_0)^2}{n_0} + \dfrac{(\sum y_1)^2}{n_1} + \dfrac{(\sum y_2)^2}{n_2} + \dfrac{(\sum y_3)^2}{n_3} + \cdots\right] - (\sum y)^2}{N\sum y^2 - (\sum y)^2}$$ (Third formula for η²) (175) 

Note that this is η². Do not forget to extract the square root. 
Now we can nicely pair a formula for r with this one by merely 
summing our xy values by columns and letting these partial 
sums stand in that form in the formula. We shall make zero 
our assumed mean in respect to both arrays; so our x values 
will be, successively, 0, 1, 2, 3, etc., up to one less than the number 
of columns. Of course, the formula would be essentially the 
same if we took some other assumed mean than zero, only then the 
partial sums would have the minus sign at the left of this assumed 
mean and the plus at the right. With these paired formulas we 
can easily get both η and r from essentially the same operations, 
only a few extra minutes being required to get either one when 
the other has been computed. The following is the r formula: 

$$r_{xy} = \frac{N\left(\sum y_1 + 2\sum y_2 + 3\sum y_3 + 4\sum y_4 + \cdots\right) - \sum x \sum y}{\sqrt{\left(N\sum y^2 - (\sum y)^2\right)\left(N\sum x^2 - (\sum x)^2\right)}}$$ (Formula for r paired in structure with the eta formula) (175a) 

The P.E. of η is usually given as 

$$\mathrm{P.E.}_{\eta} = .6745\,\frac{1 - \eta^2}{\sqrt{N}}$$

We shall now apply these two paired formulas to the 
computation of η and r for the data of Table XXVII and present similar 
data for another grade for comparison and as an exercise for the 
reader. The data are scores on the Otis Classification Test 
made by pupils in the rural schools of Centre County, Pa., in 
1928 in the eighth grade and in the sixth grade. The scores are 
distributed according to the chronological ages of the pupils, 
and our problem is to find the correlation between the ages and 
scores within a single grade range. 

TABLE XXVII. SCORES ON THE OTIS CLASSIFICATION TEST BY CHILDREN 
IN THE EIGHTH GRADE OF THE RURAL SCHOOLS OF CENTRE COUNTY, 
PA., DISTRIBUTED ACCORDING TO CHRONOLOGICAL AGE 

[Cell frequencies by age column are not recoverable; the row and 
column totals used in the computation follow.] 

Scores   |   f |  Y |    fY |   fY² 
140-149  |   1 | 12 |    12 |   144 
130-139  |   3 | 11 |    33 |   363 
120-129  |   8 | 10 |    80 |   800 
110-119  |   6 |  9 |    54 |   486 
100-109  |  22 |  8 |   176 | 1,408 
 90- 99  |  21 |  7 |   147 | 1,029 
 80- 89  |  29 |  6 |   174 | 1,044 
 70- 79  |  46 |  5 |   230 | 1,150 
 60- 69  |  37 |  4 |   148 |   592 
 50- 59  |  40 |  3 |   120 |   360 
 40- 49  |  23 |  2 |    46 |    92 
 30- 39  |   9 |  1 |     9 |     9 
 20- 29  |   4 |  0 |     0 |     0 
Totals   | 249 |    | 1,229 | 7,477 

Chronological age, | 10 to | 11 to | 12 to | 13 to | 14 to | 15 to | 16 to | Totals 
years-months       | 10-11 | 11-11 | 12-11 | 13-11 | 14-11 | 15-11 | 16-11 | 
Totals (n_c)       |     5 |    20 |    40 |    58 |    76 |    46 |     4 |   249 
ΣY_c               |    16 |   110 |   225 |   298 |   383 |   189 |     8 | 1,229 
(ΣY_c)²/n_c        |    51 |   605 | 1,266 | 1,531 | 1,930 |   777 |    16 | 6,176 
X                  |     0 |     1 |     2 |     3 |     4 |     5 |     6 | 
fX                 |     0 |    20 |    80 |   174 |   304 |   230 |    24 |   832 
fX²                |     0 |    20 |   160 |   522 | 1,216 | 1,150 |   144 | 3,212 

The summation of the y moments by columns, ΣY_c, is obtained 
in precisely the same manner as in the Pearson product-moment 
method of correlation described in Chap. IV. Here, as there, the 
sum of the y moments by columns equals the sum by rows, so 
that we have a check on the correctness of the work. That sum 
in this problem is 1,229. Applying our formula, we have 

$$\eta_{yx}^2 = \frac{249(51 + 605 + 1{,}266 + 1{,}531 + 1{,}930 + 777 + 16) - 1{,}229^2}{249(7{,}477) - 1{,}229^2} = .0772$$

Taking the square root, η_yx = .277. 

$$r = \frac{249(110 + 2\cdot 225 + 3\cdot 298 + 4\cdot 383 + 5\cdot 189 + 6\cdot 8) - 1{,}229\cdot 832}{\sqrt{(249\cdot 7{,}477 - 1{,}229^2)(249\cdot 3{,}212 - 832^2)}} = -.163$$
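The paired formulas (175) and (175a) can be run directly on the column sums. In the sketch below the sums Σy₁ through Σy₆ and the grand totals are those printed in the computation above; the column frequencies `col_n` and the first column sum (16) are not printed there and have been back-computed from the printed terms (Σy_c)²/n_c and from the totals, so they should be read as assumptions. Because the text rounds each term to a whole number, the result here agrees with the printed η only to about two decimal places.

```python
import math

N = 249
sum_y, sum_y2 = 1229, 7477        # totals of fY and fY^2
sum_x, sum_x2 = 832, 3212         # totals of fX and fX^2

# Column sums of y for x = 0, 1, ..., 6 (first entry back-computed from the total).
col_sum_y = [16, 110, 225, 298, 383, 189, 8]
# Column frequencies, back-computed from the printed terms; an assumption.
col_n = [5, 20, 40, 58, 76, 46, 4]

# Formula (175): eta squared from the column sums.
partial = sum(s * s / n for s, n in zip(col_sum_y, col_n))
eta_squared = (N * partial - sum_y ** 2) / (N * sum_y2 - sum_y ** 2)
eta = math.sqrt(eta_squared)

# Formula (175a): r from the same column sums, weighted by x = 0, 1, 2, ...
weighted = sum(x * s for x, s in enumerate(col_sum_y))
r = (N * weighted - sum_x * sum_y) / math.sqrt(
    (N * sum_y2 - sum_y ** 2) * (N * sum_x2 - sum_x ** 2)
)

print(round(eta_squared, 4), round(eta, 3), round(r, 3))
```

The internal checks (column frequencies summing to 249, Σx to 832, Σx² to 3,212) are what make the back-computed frequencies plausible.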


TABLE XXVIII. SCORES ON THE OTIS CLASSIFICATION TEST BY CHILDREN 
IN THE SIXTH GRADE OF THE RURAL SCHOOLS OF CENTRE COUNTY, 
PA., DISTRIBUTED ACCORDING TO CHRONOLOGICAL AGE 

[Table body not recoverable from the source; the score intervals 
run from 40-49 to 110-119.] 

It will be observed that the η is larger than the r, as we said 
above must be the case whenever they are not identical with each 
other. Also the η is positive, while here the r is negative. Any η 
is always positive, since it indicates only the extent to which the 
columns are shortened in comparison with the total distribution 
and hence the extent of the operation of some law causing the y 
scores to be more or less definitely placed for given values of x. 
What that law is we can know only by a further examination 
of the trend, while r indicates both the extent of the law and the 
direction of the trend. 

The probable error of our η is 

$$\mathrm{P.E.}_{\eta} = .6745\,\frac{1 - .277^2}{\sqrt{249}}$$

Since η is nearly five times its probable error, it is clear that a 
law is operating to place y scores in terms of x scores; i.e., there 
is a correlation between scores and chronological age within this 
eighth grade range. 

Applying the same two formulas to Table XXVIII, η turns out 
to be .222 and r to be −.154. The two grades show, therefore, 
very consistent results. 
Hitherto the chief use made of η was to test rectilinearity of 
regression. The regression lines in Tables XXVII and XXVIII 
are both curved. Is that due merely to chance sampling or is 
there a significant departure from rectilinearity which may be 
expected to persist with successive sampling? The extent of 
departure of η from r is a function of the departure of the 
regression from rectilinearity. It has been customary to test the 
significance of this departure by applying Blakeman's test of 
significance of (η² − r²). In the previous edition of this book 
we explained that test and applied it to Tables XXVII and 
XXVIII. But we found it to give results which were not 
plausible. The inadequacy of the Blakeman test is now recognized, 
and we are dropping it from our treatment here. Instead, we 
recommend applying the χ² test of goodness of fit to test the 
significance of the departure of the actual regression from 
rectilinearity. Fisher¹ gives a value for χ² which, for the straight line 
as the theoretical value, reduces to 

$$\chi^2 = (N - k)\,\frac{\eta^2 - r^2}{1 - \eta^2}$$ (χ² in terms of η² and r) (176) 

¹ Fisher, R. A., "The Goodness of Fit of Regression Formulae, and the 
Distribution of Regression Coefficients," J. Roy. Statistical Soc., Vol. 85, 
Part IV, pp. 597-612 (1922). (This expression is not quite the conventional 
one for χ², though close enough for large samples. For small samples 
Fisher gives a correction.) 

In this the N is the population of the sample and k is the number 
of columns. The result is interpreted in the manner explained 
on pages 410 to 419, with the aid of Table XLVII, page 498. 
The χ² table must be entered with n = (k − 2) or n′ = (k − 1). 
The value of χ² calculated by this formula for Table XXVII is 
13.26. Entering the table of χ² values with n = (7 − 2), which 
is 5, we find that a χ² of 13 shows a P of .023379, and a χ² of 14 
shows a P of .015609. Linear interpolation between these for 
χ² = 13.26 gives P = .021359. The probability is, therefore, a 
little more than .02 (two chances in a hundred) that a discrepancy 
as great as the one obtained in this problem might arise as a 
matter of chance fluctuation even though the true regression were 
rectilinear. That would leave the hypothesis that the regression 
might be rectilinear not wholly refuted, though rendered rather 
untenable. The fact, however, that a second sample shows a 
departure from rectilinearity in the same direction (both 
regression lines having the same general shape) further weakens the 
hypothesis that the true regression line might be rectilinear. The 
first sample alone gives a fairly significant (beyond 5 per cent), but 
not highly significant (1 per cent or less), difference from 
rectilinearity; but the two samples jointly give highly significant 
evidence of the curvilinearity of the regression.¹ 
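Formula (176) and the interpolation in the χ² table can be retraced as follows. This is a sketch under stated assumptions: the function names are illustrative, the inputs are the rounded values η² = .0772 and r = −.163 from the text (so χ² comes out near, not exactly at, the printed 13.26), and the exact tail probability uses the closed form that holds for 5 degrees of freedom rather than the printed table.

```python
import math

def chi_square_linearity(eta_squared, r_squared, n_total, k_columns):
    """Chi-square for departure of regression from rectilinearity, formula (176)."""
    return (n_total - k_columns) * (eta_squared - r_squared) / (1 - eta_squared)

def chi2_upper_tail_5df(x):
    """P(chi-square > x) for 5 degrees of freedom (closed form for odd d.f.)."""
    c = math.sqrt(x)
    unit_ordinate = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(c / math.sqrt(2)) + 2 * unit_ordinate * (c + c ** 3 / 3)

chi2 = chi_square_linearity(0.0772, 0.163 ** 2, n_total=249, k_columns=7)

# Linear interpolation between the tabled P values at chi-square 13 and 14.
p_interpolated = 0.023379 + (13.26 - 13) * (0.015609 - 0.023379)
# Exact tail probability at the same chi-square, for comparison.
p_exact = chi2_upper_tail_5df(13.26)

print(round(chi2, 2), round(p_interpolated, 6), round(p_exact, 6))
```

Since the tail probability falls along a curve that bends upward over this range, linear interpolation runs slightly above the exact value; the conclusion drawn in the text is unaffected.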
A CORRELATION RATIO WITHOUT BIAS 

Unfortunately n is affected by the number of items in the 
several classes as well as by the inherent extent of correlation. 
For, as we saw on page 69, the variance of a class shrinks more 
and more, compared with its true population value, as the n 
decreases. In the numerator of the fraction in the formula, 


7 
t_j- 
(4) a a 
the true population value would, on the average, be no2/(ne — 1), 
while that of the denominator would be wo of. Because te 


($\chi^2$ in terms of $\eta^2$ and $r$) (176)


¹ On page 327 we give a better test for the goodness of fit of regression
lines.
is not equal to $N$, the value of the fraction and, consequently,
the value of $\eta^2$ will be affected by the population of the sample,
or by the number of classes into which the total population is
divided. In order to get a correlation ratio independent of this
disturbance, Kelley¹ recently developed a new formula for the
correlation ratio, which he designated $\epsilon$.

If we employ population variances instead of sample variances,
the estimates of the variances will not be altered by reason of
smallness of populations in the columns, or in the whole $y$
distribution, because our method of estimating population variances
takes care of that. We learned on page 70 how to substitute for
sample variance an estimate of the population variance: we
need only divide our sum of squares by $(N - 1)$ instead of by $N$.
Or, if we already have $\sigma^2$, $s^2$ (the estimate of the population
variance) is merely $\frac{N}{N - 1}\sigma^2$. So instead of $\sigma_y^2$ in the $\eta$ formula we
need only to use $\frac{N}{N - 1}\sigma_y^2$. For the numerator we need the
population variance of the $\sigma_a^2$'s. If our assumption of
homoscedasticity were met perfectly, we could get that by merely
taking $\frac{n_c}{n_c - 1}\sigma_a^2$, where $\sigma_a^2$ is computed from any column. But
we shall do better to estimate the population variance of the
columns by taking a weighted average from all the columns.
Letting $s_{a_j}^2$ stand for an estimate of the population variance of a
particular column and letting $n_{c_j}$ be the number of items in that
column,

$$s_{a_j}^2 = \frac{n_{c_j}\sigma_{a_j}^2}{n_{c_j} - 1}$$

Clearing of fractions,

$$(n_{c_j} - 1)\,s_{a_j}^2 = n_{c_j}\sigma_{a_j}^2$$

Summing for all the columns,

$$\sum_{j=1}^{k}(n_{c_j} - 1)\,s_{a_j}^2 = \sum_{j=1}^{k} n_{c_j}\sigma_{a_j}^2$$

where $k$ is the number of columns, $n_{c_j}$ stands for the population
of any column by which the $\sigma_{a_j}^2$ are to be weighted, and the


¹ Kelley, T. L., “An Unbiased Correlation Measure,” Proc. Nat. Acad.
Sci., Vol. 21, pp. 554-559 (1935).
symbols above and below the $\sum$ indicate the limits of summation.
Assuming homoscedasticity for the purpose of estimating the
population variance for the columns, but retaining on the right
differing $\sigma_a^2$ values and weighting them for their population
values, and dropping the $j$ subscripts,

$$\left(\sum n_c - k\right)s_a^2 = \sum n_c\sigma_a^2; \quad \text{whence} \quad s_a^2 = \frac{\sum n_c\sigma_a^2}{N - k}$$

Substituting these two estimates of the population variances in
Eq. (A) and using $\epsilon^2$ to designate the estimate of $\eta^2$ (for we can
never know $\eta$, the true population ratio, but only estimate it),

$$\epsilon^2 = 1 - \frac{\dfrac{\sum n_c\sigma_a^2}{N - k}}{\dfrac{N\sigma_y^2}{N - 1}} = 1 - \frac{\sum n_c\sigma_a^2\,(N - 1)}{N\sigma_y^2\,(N - k)} \qquad \text{(Correlation ratio without bias)} \quad (177)$$


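The elements of this formula can readily be computed from
the squares of deviations from the means:

$$\sum n_c\sigma_a^2 = \sum\sum (y_c - \bar{y}_c)^2; \qquad N\sigma_y^2 = \sum (y - \bar{y})^2$$

Or they can be computed from the scores by the following
equivalent formulas:

$$\sum n_c\sigma_a^2 = \sum\left(\sum y_c^2 - \frac{(\sum y_c)^2}{n_c}\right); \qquad N\sigma_y^2 = \sum y^2 - \frac{(\sum y)^2}{N}$$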


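Or, if $\eta^2$ has already been computed, we may substitute it
in formula (177) and get $\epsilon^2$ in terms of $\eta^2$. Preserving the
weightings of the variance of the columns for the differing
populations of the columns,

$$\eta^2 = 1 - \frac{\sum n_c\sigma_a^2}{N\sigma_y^2}, \quad \text{whence} \quad \frac{\sum n_c\sigma_a^2}{N\sigma_y^2} = 1 - \eta^2$$

and from the above

$$\epsilon^2 = 1 - \frac{\sum n_c\sigma_a^2}{N\sigma_y^2}\cdot\frac{N - 1}{N - k}$$

$$\epsilon^2 = 1 - \frac{(N - 1)(1 - \eta^2)}{N - k} = \frac{(N - k) - (N - 1) + (N - 1)\eta^2}{N - k} = \frac{(N - 1)\eta^2 - (k - 1)}{N - k} \qquad (\epsilon^2 \text{ in terms of } \eta^2) \quad (178)$$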
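
As a check on the algebra, the two routes to $\epsilon^2$ can be compared numerically. The sketch below is not part of the original derivation: the scores are hypothetical, the function names are ours, and formulas (177) and (178) are taken in the forms $1 - \sum n_c\sigma_a^2\,(N - 1)/[N\sigma_y^2\,(N - k)]$ and $[(N - 1)\eta^2 - (k - 1)]/(N - k)$.

```python
def sums_of_squares(columns):
    """Return (within-column SS, total SS), i.e. sum(n_c*sigma_a^2) and N*sigma_y^2."""
    scores = [y for col in columns for y in col]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    ss_within = 0.0
    for col in columns:
        col_mean = sum(col) / len(col)
        ss_within += sum((y - col_mean) ** 2 for y in col)
    return ss_within, ss_total


def epsilon_sq_from_sums(ss_within, ss_total, n_total, k):
    """Formula (177): one minus the ratio of the two population-variance estimates."""
    return 1.0 - (ss_within * (n_total - 1)) / (ss_total * (n_total - k))


def epsilon_sq_from_eta(eta_sq, n_total, k):
    """Formula (178): epsilon squared expressed in terms of eta squared."""
    return ((n_total - 1) * eta_sq - (k - 1)) / (n_total - k)


# Hypothetical scores in three columns of unequal size (not data from the book).
columns = [[4, 6, 5, 7], [8, 9, 7], [10, 12, 11, 9, 13]]
n_total = sum(len(col) for col in columns)
k = len(columns)

ss_within, ss_total = sums_of_squares(columns)
eta_sq = 1.0 - ss_within / ss_total
eps_a = epsilon_sq_from_sums(ss_within, ss_total, n_total, k)
eps_b = epsilon_sq_from_eta(eta_sq, n_total, k)
print(round(eta_sq, 4), round(eps_a, 4), round(eps_b, 4))
```

Both routes give the same $\epsilon^2$, and it falls below $\eta^2$, as the correction for bias requires.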
If we substitute zero for $\epsilon^2$ in formula (178) we shall get a value
for the average $\eta^2$ when the true $\eta$ is zero. Calling this $\eta_0^2$,

$$\frac{(N - 1)\eta_0^2 - (k - 1)}{N - k} = 0; \qquad \eta_0^2 = \frac{k - 1}{N - 1} \qquad \text{(Average value of } \eta^2 \text{ when the true } \eta \text{ is zero)} \quad (179)$$

Since $\sigma_a^2$ is always less than $\sigma_y^2$ and since $\eta$ could extremely
rarely be expected to be exactly zero, it is easy to see why $\eta$
should tend to have a slight positive bias. Kelley’s formula
corrects for this.
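
The bias can also be seen by experiment. The following sketch is an illustration of ours, not a procedure from the text: it draws many samples in which every column comes from one and the same normal population (so the true $\eta$ is zero), and compares the average $\eta^2$ with $(k - 1)/(N - 1)$ from formula (179). The seed, sample sizes, and number of trials are arbitrary assumptions.

```python
import random


def eta_squared(columns):
    """Sample correlation ratio squared: 1 - within-column SS / total SS."""
    scores = [y for col in columns for y in col]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    ss_within = 0.0
    for col in columns:
        col_mean = sum(col) / len(col)
        ss_within += sum((y - col_mean) ** 2 for y in col)
    return 1.0 - ss_within / ss_total


def epsilon_squared(eta_sq, n_total, k):
    """Formula (178)."""
    return ((n_total - 1) * eta_sq - (k - 1)) / (n_total - k)


rng = random.Random(1935)            # fixed seed so the run is repeatable
k, per_column, trials = 5, 6, 2000   # arbitrary illustrative sizes
n_total = k * per_column

eta_values = []
for _ in range(trials):
    # Every column is drawn from the same population, so no law is present.
    columns = [[rng.gauss(0.0, 1.0) for _ in range(per_column)] for _ in range(k)]
    eta_values.append(eta_squared(columns))

mean_eta_sq = sum(eta_values) / trials
mean_eps_sq = sum(epsilon_squared(e, n_total, k) for e in eta_values) / trials
expected_eta_sq = (k - 1) / (n_total - 1)   # formula (179)
print(round(mean_eta_sq, 3), round(expected_eta_sq, 3), round(mean_eps_sq, 3))
```

With these settings the average $\eta^2$ should settle near $4/29$, while the average $\epsilon^2$ stays near zero.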

Besides the constant positive bias in $\eta$, there is a second
disturbing factor for which the $\epsilon$ technique corrects in part but
for which a further correction is needed; i.e., the dependence of $\eta$
upon the number of classes into which the $x$ distribution is
divided. One reason why $\eta$ with a large number of categories
differs from the value in the same situation with a smaller number
is that with the larger number the populations in the several
classes are lessened. If there were as many classes as the total
number of items $N$, $\eta$ would necessarily be unity; for with a
single item in each column the variance of the column would be
zero. But in the total population, which is hypothetically
infinite in size, the variances would remain the same regardless
of the narrowing of classes, because the populations in the classes
would still be infinite. Thus to the extent to which $\epsilon$ overcomes
the effect of smallness of populations in the classes, it corrects
for differing numbers of categories. In the direction of fineness
of grouping this constitutes the necessary correction. But in
the direction of broad categories a disturbing factor still remains.
If a category is broad enough to combine within it a number of
elementary classes, the means of these several elementary classes
will differ from the mean of the combined elementary classes
to the extent to which there is present regression which differs
from zero. Thus the variances of the broad classes will be somewhat
too great as compared with the variances of the constituent
elementary classes, and $\epsilon$ from broad categories will be somewhat
too low. But a satisfactory correction for this is easily made.
If it is assumed that, within each broad class, the regression is
rectilinear and the slope is represented by $\epsilon\,\sigma_y/\sigma_x$, exactly the
same technique may be applied as that employed for correcting
rectilinear correlation for broad categories, described on pages
393 to 399. This assumption is not strictly true, but it is the
best simplifying assumption that can be made¹ and is correct
to a good degree of approximation. On this assumption the
correction involves merely dividing the obtained $\epsilon$ by the product
of the $r$’s between index values and variates in each of the arrays,
which $r$’s are tabled on page 398.

$$\epsilon_c = \frac{\epsilon}{\prod r} \qquad (\epsilon \text{ corrected for broad categories}) \quad (180)$$


We shall now apply these two corrections to the $\eta^2$ computed
for Table XXVII, page 316. First the correction for bias is as
follows:

$$\epsilon^2 = \frac{(N - 1)\eta^2 - (k - 1)}{N - k} = \frac{(249 - 1)(.0772) - (7 - 1)}{249 - 7} = .0544$$

$$\epsilon = .233$$
To correct for broad categories we must divide the $\epsilon$ obtained
above by the product of the $r$’s between index values and variates
for each of the two distributions. These depend upon the
number of categories and the assumed shape of the distributions.
Both the distributions as to age and as to achievement are
approximately normal, so we use the third column of $r$’s in the
table. For ages the number of categories is 7, and the
corresponding $r$ between index values and variates in the row for
7 categories is .970. In respect to achievement there are 13
categories, for which the tabled $r$ is .991. Dividing by the
product of these two $r$’s, we have

$$\epsilon_c = \frac{.233}{(.970)(.991)} = .242$$
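
The two corrections just applied can be retraced with a short calculation. This is only a sketch using the rounded figures quoted in the text ($\eta^2 = .0772$, $N = 249$, $k = 7$, tabled $r$’s of .970 and .991); the function name is ours. From the rounded $\eta^2$ the bias-corrected value comes out at about .0543, a shade under the .0544 printed above, which presumably rests on an unrounded $\eta^2$.

```python
import math


def epsilon_squared(eta_sq, n_total, k):
    """Formula (178): epsilon squared from eta squared, sample size, and class count."""
    return ((n_total - 1) * eta_sq - (k - 1)) / (n_total - k)


# Correction for bias, using the rounded values quoted in the text.
eps_sq = epsilon_squared(eta_sq=0.0772, n_total=249, k=7)
eps = math.sqrt(eps_sq)

# Correction for broad categories, formula (180): divide by the product of the
# tabled r's for 7 age categories (.970) and 13 achievement categories (.991).
eps_corrected = eps / (0.970 * 0.991)

print(round(eps_sq, 4), round(eps, 3), round(eps_corrected, 3))
```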


This corrected $\epsilon$ has a standard meaning, free from bias
and independent of the size of the population of the sample and
of the number of classes into which the sample is divided. In
this form it is free from the objections on account of which


¹ This is the same assumption that is made by Student in deriving a
different formula for correcting $\eta$ for broad categories. Student,
“Correction to Be Made to the Correlation Ratio for Grouping,” Biometrika, Vol. 9,
p. 317.
Fisher dismissed $\eta$ as of “extremely limited” utility.¹ Thus
corrected $\epsilon$ should have a wide and important usefulness in
statistical research. We show below that it has all the merits of
Fisher’s analysis of variance technique and has, besides, a
constructive meaning which makes it a positive rather than a
merely negative utility.

In the article previously referred to, Prof. Kelley derives a
formula for the standard error of $\epsilon^2$ and of $\epsilon$.


s rp 1 
te = WA (or oh se? (Stands error (181) 


which holds when $1/N$ is not small in comparison with $1/\sqrt{N}$.


= = 
T = er [= ate se| Gindon error (182) 


which is satisfactory if $\epsilon$ is not small.

The interpretation of the values obtained from the application 
of these formulas requires a knowledge of the form of the dis- 
tribution of samples around any hypothetical true value of the 
statistic. At present we do not know that distribution (except 
as treated in our next paragraph). If the correlation is not 
very high and the population reasonably large, we may take 
the distribution to be normal, with little risk of appreciable 
distortion in our interpretation. We raise that question of 
distribution in our next paragraph. The reader may wish to 
compare the outcome from the technique of our next paragraph 
with what he would get from formula (181) on the assumption
of a true value of zero for $\epsilon$ and the use of the table of the normal
distribution.

Testing the Null Hypothesis.—Frequently we wish to know 
whether any law at all is present or whether the $\epsilon$ we have from
our sample might reasonably have arisen merely by chance
fluctuation in sampling. For this purpose we need to know
the distribution of $\epsilon^2$ when the true correlation is zero. We have
made tables for this distribution which are presented on pages
494-497 of this book. To test in this way the $\epsilon^2$ of our
illustration, we enter our table with $(k - 1)$ equals 6, $(N - k)$ equals 242.
We do not find there a row for 242, so we shall interpolate between


¹ Fisher, R. A., Statistical Methods for Research Workers, 7th ed., p. 264.
the rows 200 and 400. In row 200 and column 6 we find .032
for the 5 per cent point and .052 for the 1 per cent point. In row
400 the values are .016 and .027. Interpolation gives .029 and
.047. The $\epsilon^2$ in our illustration is .054, which is larger than
even the one which stands at the 1 per cent point. This means
that, if there were no law relating educational score to age within
a single grade, we would get such a large $\epsilon^2$ considerably less than
1 time in 100. The null hypothesis is, therefore, disproved; it
is highly probable that there is some law relating educational
achievement score and age within a single grade, and the extent
of this law is expressed by a correlation ratio of .242. To
determine what is the character of that law should be the next
step in our research with these data. That step would be curve
fitting, which we discuss in Chap. XV.
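
The interpolation between rows 200 and 400 is simple proportion, and may be written out as follows. The sketch uses only the tabled values quoted in the text; the helper name is ours, and straight-line interpolation on $(N - k)$ is assumed because that is what the text's results of .029 and .047 imply.

```python
def interpolate_rows(target, low_row, high_row, low_value, high_value):
    """Straight-line interpolation between two rows of the table."""
    fraction = (target - low_row) / (high_row - low_row)
    return low_value + fraction * (high_value - low_value)


n_minus_k = 249 - 7   # 242, which lies between the tabled rows 200 and 400

# Column 6, that is (k - 1) = 6, as quoted in the text.
five_per_cent = interpolate_rows(n_minus_k, 200, 400, 0.032, 0.016)
one_per_cent = interpolate_rows(n_minus_k, 200, 400, 0.052, 0.027)

observed = 0.054      # epsilon squared of the illustration
print(round(five_per_cent, 3), round(one_per_cent, 3))
print(observed > one_per_cent)   # True: beyond even the 1 per cent point
```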

On a later page (337) we give data which are there employed
to illustrate analysis of variance. Those same data can easily
be worked up into $\epsilon^2$, as follows:

$$\epsilon^2 = 1 - \frac{s_a^2}{s_y^2} = 1 - \frac{44{,}390}{154{,}420} = .713$$

Here $(k - 1)$ is 4 and $(N - k)$ is 25. Entering our table with
these $n$’s we find an $\epsilon^2$ of .195 at the 5 per cent point and .305 at
the 1 per cent point. Our obtained $\epsilon^2$ is much greater than even
the 1 per cent value, which means that, if there were no law
present, so large a value would be obtained much less than 1 time
in 100. This is entirely consistent with the showing by the
technique of analysis of variance, as reference to page 338 will
show. In fact, the test by the epsilon technique and the analysis
of variance technique will always give precisely the same
results. Thus we see there is a law relating breed of cattle to
milk production and the extent of this law is expressed by the
correlation ratio .845, the square root of $\epsilon^2$. What the law is
must be determined by comparing means and variabilities of
production among the breeds with large samples and controlled
conditions. Whether or not for this test $\epsilon$ should be corrected
for broad categories depends upon whether the classes are
continuous and arbitrarily divided into classes or whether they may
be more sensibly regarded as centering around point values.
The former of our illustrations undoubtedly involves the former
character, while the second is probably of the latter class.
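
The agreement between the two techniques can be made explicit. Since the total sum of squares is the sum of the within-class and between-class sums, $\epsilon^2$ and the variance ratio $F$ (means estimate over within-class estimate) determine one another: $\epsilon^2 = 1 - (N - 1)/[(k - 1)F + (N - k)]$. That identity is our own restatement, not a formula printed in the text, and the $F$ values 2.76 and 4.18 used below are the customary tabled 5 and 1 per cent points for 4 and 25 degrees of freedom, taken from standard $F$ tables rather than from this book.

```python
def epsilon_sq_from_f(f_ratio, n_total, k):
    """Epsilon squared implied by a variance ratio F with (k-1, N-k) degrees of freedom."""
    return 1.0 - (n_total - 1) / ((k - 1) * f_ratio + (n_total - k))


def f_from_epsilon_sq(eps_sq, n_total, k):
    """Inverse relation: the variance ratio implied by a given epsilon squared."""
    return ((n_total - 1) / (1.0 - eps_sq) - (n_total - k)) / (k - 1)


n_total, k = 30, 5   # the milk data: (k - 1) = 4, (N - k) = 25

# Variance estimates quoted in the text for the milk data.
var_means, var_within, var_total = 842106, 44390, 154420
eps_sq = 1.0 - var_within / var_total
f_ratio = var_means / var_within

# Standard tabled F points for 4 and 25 degrees of freedom (assumed inputs).
eps_5_per_cent = epsilon_sq_from_f(2.76, n_total, k)
eps_1_per_cent = epsilon_sq_from_f(4.18, n_total, k)

print(round(eps_sq, 3), round(f_ratio, 2))
print(round(eps_5_per_cent, 3), round(eps_1_per_cent, 3))
```

The converted $F$ points reproduce the .195 and .305 entered from the $\epsilon^2$ table, which is why a result significant by one test is significant by the other.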
THE PARTIAL CORRELATION RATIO


Any formula for partial $\eta$ in terms of lower order correlations¹
would either demand that we know and apply the equation of
the regression curve, which would be too cumbersome to be
practical, or that we make assumptions about this curve which
would be too hazardous to risk in practice. But a partial
correlation ratio can be determined by selection. Suppose we
have, in a large population, scores on general intelligence ($x$),
high-school scholarship ($z$), and academic success in college ($y$),
and we wish to know the correlation ratio between high-school
scholarship and college success with the general intelligence
factor held constant. We can sort our individuals into classes
on the general intelligence ($x$) factor, then subsort these classes
according to high-school scholarship ($z$). After both $x$ and $z$
are thus held constant, these subclasses will still have a certain
variance due to factors other than $x$ and $z$.

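If we denote the weighted average $y$-variance of the subclasses
by $\sigma_{a \cdot xz}^2$, and that of the $x$ classes by $\sigma_{a \cdot x}^2$, then by definition of the
partial correlation ratio,

$$\eta_{yz \cdot x}^2 = 1 - \frac{\sigma_{a \cdot xz}^2}{\sigma_{a \cdot x}^2} \qquad \text{(Partial } \eta^2\text{, } y \text{ on } z \text{ with } x \text{ held constant)} \quad (183)$$

For partial epsilon this would be

$$\epsilon_{yz \cdot x}^2 = 1 - \frac{\sum n\,\sigma_{a \cdot xz}^2\,(N - k)}{\sum n\,\sigma_{a \cdot x}^2\,(N - kp)} \qquad \text{(Partial } \epsilon^2\text{, } y \text{ on } z \text{ with } x \text{ held constant)} \quad (184)$$

where $k$ is the number of classes into which the population is
sorted on $x$, and $p$ is the number of classes into which each $x$ class
is subsorted. One could get a tentative idea of the extent of the
partial correlation by determining the variance of one sample of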

¹ L. Isserlis derived such a formula: “The Partial Correlation Ratio,”
Biometrika, Vol. 10, pp. 391-411. In spite of the title, the Isserlis formula is
for multiple eta instead of for partial eta. But it can lead into the latter by
substituting the value found for multiple eta [$H_{y(xz)}$] into the formula:
$\eta_{yz \cdot x}^2 = (H_{y(xz)}^2 - \eta_{yx}^2)/(1 - \eta_{yx}^2)$. But the derivation assumes that the
regression of $y$ on $x$ for $z$ constant is rectilinear and also that of $z$ on $y$ is
rectilinear for $z$ constant. These are too hazardous assumptions for refined
practice.
$y$ scores from a single $x$ class and then drawing from this a
subsample for a single $z$ value and computing the variance of this
subclass. The partial correlation ratio would be, so far as this
meager trial could suggest, the square root of the difference
between these two variances divided by the variance of the
$x$ class. This would yield more valid results to the extent to
which the average from a number of subclasses was employed
rather than a single one; and, of course, for a good determination
the values should be summed over the whole table. With the
Hollerith machine equipment this should not be a very difficult
process.
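
Summed over the whole table, the sorting procedure amounts to two within-class sums of squares. The sketch below is offered only as an illustration: the scores are hypothetical, the names are ours, and formulas (183) and (184) are assumed to read $1 - SS_{\text{subclasses}}/SS_{x\ \text{classes}}$ and $1 - SS_{\text{subclasses}}(N - k)/[SS_{x\ \text{classes}}(N - kp)]$, where each $SS$ is a sum of squared deviations of $y$ about the class means.

```python
def sum_sq_about_mean(values):
    """Sum of squared deviations of the values about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def partial_ratios(table):
    """table maps each x class to a dict of z subclass -> list of y scores."""
    n_total = 0
    n_subclasses = 0          # equals k * p when every x class has p subclasses
    ss_x_classes = 0.0        # y variation left when x alone is held constant
    ss_subclasses = 0.0       # y variation left when x and z are both held constant
    for subclasses in table.values():
        pooled = [y for ys in subclasses.values() for y in ys]
        n_total += len(pooled)
        n_subclasses += len(subclasses)
        ss_x_classes += sum_sq_about_mean(pooled)
        ss_subclasses += sum(sum_sq_about_mean(ys) for ys in subclasses.values())
    k = len(table)
    partial_eta_sq = 1.0 - ss_subclasses / ss_x_classes
    partial_eps_sq = 1.0 - (ss_subclasses * (n_total - k)) / (
        ss_x_classes * (n_total - n_subclasses)
    )
    return partial_eta_sq, partial_eps_sq


# Hypothetical college scores (y), sorted on intelligence (x), then scholarship (z).
table = {
    "low x": {"low z": [50, 54, 52], "high z": [58, 60, 62]},
    "high x": {"low z": [61, 63, 65], "high z": [70, 68, 72]},
}
partial_eta_sq, partial_eps_sq = partial_ratios(table)
print(round(partial_eta_sq, 4), round(partial_eps_sq, 4))
```

As with the simple ratio, the partial $\epsilon^2$ comes out somewhat below the partial $\eta^2$.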


TESTING THE GOODNESS OF FIT OF ANY REGRESSION LINE 

In terms of $\epsilon$ we can easily make an “exact” test of the
goodness of fit of any regression line. This parallels the $\chi^2$ test on
page 319 but is a more precise one and applies not only to a
rectilinear regression but to regression lines of any shape. We
said that, after having found by the correlation ratio technique
that some law is present in our data, our next concern would be
to investigate the nature of that law. One method of doing this
is to seek the curve that best fits the trend. The technique of
curve fitting is considered in Chap. XV. Having fitted a
promising curve, we would next wish to test mathematically the
goodness of the fit and hence the appropriateness of the type of
curve fitted. Our formula for doing this is derived as follows:

Refer again to such a layout as that of Table XXVII, page 316.
Conceive of a new set of derived values, $y'$, each of which is
the original value taken as a deviation from the point on the
regression to which its column belongs. These derived values
would make a new table of columns with a new set of means
fluctuating about a line of zero slope. For this derived table we
can have a new correlation ratio,

(B) $$\epsilon'^2 = 1 - \frac{s_a^2}{s_{y'}^2}$$

The variance of each new column will be the same as before,
since there is no change except that a constant has been
subtracted from all the scores, which does not affect the variance.
But $s_{y'}^2$ will differ from $s_y^2$. We must find a value for $s_{y'}^2$. In
a given column, if $\tilde{y}_j$ is the value of the point on the regression
line in terms of deviations from the whole $y$ mean,

$$y = y' + \tilde{y}_j; \qquad y^2 = y'^2 + \tilde{y}_j^2 + 2y'\tilde{y}_j$$

$$\frac{\sum y^2}{n_{c_j}} = \frac{\sum y'^2}{n_{c_j}} + \frac{\sum \tilde{y}_j^2}{n_{c_j}} + \frac{2\sum y'\tilde{y}_j}{n_{c_j}}$$

Summing for all the columns weighted for their frequencies and
dividing by the sum of the weights,

$$\frac{\sum y^2}{N} = \frac{\sum y'^2}{N} + \frac{\sum \tilde{y}^2}{N} + \frac{2\sum y'\tilde{y}}{N}$$

But, if the regression line is the best-fit one, the last term above
will sum to zero over the whole sample, or substantially so.
Therefore,

$$\sigma_y^2 = \sigma_{y'}^2 + \sigma_{\tilde{y}}^2. \quad \text{And, transposing,} \quad \sigma_{y'}^2 = \sigma_y^2 - \sigma_{\tilde{y}}^2$$

Multiply through by $N/(N - 1)$,

(C) $$s_{y'}^2 = s_y^2 - s_{\tilde{y}}^2$$

Now define a new term, $R^2$, so that $R^2 = \sigma_{\tilde{y}}^2/\sigma_y^2$, whence

$$s_{\tilde{y}}^2 = R^2 s_y^2$$

Substituting in (C),

$$s_{y'}^2 = s_y^2 - R^2 s_y^2 = s_y^2(1 - R^2)$$

Substitute this value in (B),

$$\epsilon'^2 = 1 - \frac{s_a^2}{s_y^2(1 - R^2)} = \frac{(1 - R^2) - \dfrac{s_a^2}{s_y^2}}{1 - R^2} = \frac{\left(1 - \dfrac{s_a^2}{s_y^2}\right) - R^2}{1 - R^2}$$

Therefore

$$\epsilon'^2 = \frac{\epsilon^2 - R^2}{1 - R^2} \qquad (185)$$
If the fitted line is a rectilinear one, $R$ is merely $r$, the
coefficient of correlation. This follows directly from our showing
on page 240 that the standard deviation of the points on the
straight regression line equals $r\sigma_y$. For, by definition,

$$R^2 = \frac{\sigma_{\tilde{y}}^2}{\sigma_y^2} = \frac{(r\sigma_y)^2}{\sigma_y^2} = r^2$$


For other regression lines general formulas can easily be made;
or the standard deviation of the points on the regression line
can be computed directly by reason of knowledge of the frequency
at each column and the ability to calculate the $\tilde{y}_j$ value at each
column from the equation of the fitted curve.

We may apply this technique to testing the rectilinearity of
regression for the data in Table XXVII, employing the values
of the statistics found earlier in this chapter.

$$\epsilon'^2 = \frac{\epsilon^2 - r^2}{1 - r^2} = \frac{.0544 - (-.163)^2}{1 - (-.163)^2} = .0286$$

$\epsilon'^2$ has the same form of distribution as $\epsilon^2$, so we use the same
tables for interpreting it. The value cited on page 325 for this
table for $\epsilon^2$ when the true correlation is zero was .029 at the
5 per cent point and .047 at the 1 per cent point. So the obtained
value for $\epsilon'^2$ lies only a little distance below the 5 per cent point
and the departure from rectilinearity is shown to be barely
significant. This tallies with the other tests made earlier in this
chapter.

When a parabola is fitted by the methods of Chap. XV, its
equation turns out to be

$$Y = 4.1812 + 1.1091X - .2288X^2$$

The $R$ is found, by computation, to be .253. Therefore

$$\epsilon'^2 = \frac{\epsilon^2 - R^2}{1 - R^2} = \frac{.0544 - .0652}{1 - .0652} = -.0116$$

This is a low value, much below the 5 per cent point, .027. So
the parabola gives an excellent fit. The $\epsilon'^2$ is negative, which
means that the deviations are actually less than they would be
on the average if the true regression line were the one which
we fitted. $\eta^2$ can never be negative, but $\epsilon^2$ can be. If the true
relation is zero, the $\epsilon^2$’s from samples must average zero, so that
some samples must yield negative $\epsilon^2$’s.
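
Formula (185) is quickly applied once $\epsilon^2$ and $R^2$ are in hand. The sketch below retraces the two tests of this section from the figures quoted in the text ($\epsilon^2 = .0544$, $r = -.163$ for the straight line, $R^2 = .0652$ for the parabola); the function name is ours, and the critical value .029 is the interpolated 5 per cent point cited above.

```python
def epsilon_prime_sq(eps_sq, r_sq):
    """Formula (185): epsilon squared of the deviations from a fitted regression line."""
    return (eps_sq - r_sq) / (1.0 - r_sq)


eps_sq = 0.0544

# Straight line: R is simply r, the coefficient of correlation.
linear = epsilon_prime_sq(eps_sq, (-0.163) ** 2)

# Parabola: R squared as used in the worked example.
parabola = epsilon_prime_sq(eps_sq, 0.0652)

print(round(linear, 4))     # about .0286, just under the 5 per cent point of .029
print(round(parabola, 4))   # negative: the parabola leaves less than chance scatter
```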


Exercises 


1. Compute $\eta$ and $\epsilon$ for Table XXVIII, and determine the probability
that there is a true correlation above zero; the probability that the regression
is rectilinear.

2. Test the rectilinearity of regression in Table IX, page 100, by both the
$\chi^2$ and the $\epsilon'^2$ tests.

3. Apply the correlation ratio technique to the exercises used in the next
chapter and compare it with the analysis of variance technique.

4. With a large population of suitable scores from a study to which you
have access, try computing a partial $\eta$ and partial $\epsilon$.
CHAPTER XII
ANALYSIS OF VARIANCE
ANALYSIS OF THE SAMPLE VARIANCE


At least to persons already familiar with rectilinear and
curvilinear correlation, we believe that the most illuminating approach
to the now popular analysis of variance is through these familiar
concepts. We shall show first that, so long as we stay within
the sample, analysis of variance is an extremely simple process.
After we have shown this, we shall broaden the concept and
lead by successive steps into some of its more complicated
ramifications.

The reader is asked to refer again to a typical correlation chart,
such as that on page 100. It will be observed that there remains
some scatter in each $y$ column even though all the individuals in
the column have the same $x$ value. In other words, when $x$ is
held constant, there still remains some variability in the $y$ scores.
But, when correlation is present, this variability is less than that
for the whole distribution; put in terms of proportion it is $\sigma_a^2/\sigma_y^2$.
Since this is the proportion of the variance (the $\sigma^2$) remaining
when $x$ is held constant, it may be considered the proportion of
the variance in $y$ attributable to the factors in $y$ other than $x$.
Conversely, the reduction in variance when $x$ is held constant is
the part of the variance attributable to the $x$ factor. Put in
terms of the proportion of the entire variance of $y$, this is

$$\frac{\sigma_y^2 - \sigma_a^2}{\sigma_y^2}$$


Now, as shown on page ‘PLT; 


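$$r = \sqrt{1 - \frac{\sigma_a^2}{\sigma_y^2}}, \quad \text{whence} \quad r^2 = 1 - \frac{\sigma_a^2}{\sigma_y^2} = \frac{\sigma_y^2 - \sigma_a^2}{\sigma_y^2}$$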


Thus the total variance may be divided into two portions of which
the proportion attributable to the $x$ factor (or rather, to what is
common to $x$ and $y$) is equal to $r^2$ and the proportion attributable
to the other factors is $\sigma_a^2/\sigma_y^2 = 1 - r^2$. $r^2$ is sometimes called the
coefficient of determination; hence the proportion of the total
variance attributable to the factor that is correlated with $y$ is
the same as the coefficient of determination.

But in the above formula $\sigma_a^2$ was taken from the straight
regression line as origin rather than from the means of the respective
columns. That is not what is customarily done in computing a
$\sigma_a$. If the reader will now refer to the correlation ratio, page 312
in our preceding chapter, he will find this limitation removed
in the following formula for the correlation ratio:

$$\eta^2 = 1 - \frac{\sigma_a^2}{\sigma_y^2}$$

where $\sigma_a^2$ is computed from the means of the columns as origin.
This is true for $r^2$ only when the regression is strictly rectilinear.
Hence it is always true (assuming homoscedasticity) that the
proportion of the variance attributable to the $x$ factor is $\eta^2$
and the proportion to the other factors is $\sigma_a^2/\sigma_y^2$, which is also
$1 - \eta^2$. In formula (174), page 314, it is shown that

$$\eta^2 = \frac{\sigma_m^2}{\sigma_y^2}$$

Hence we may restate the above in the following form: The
variance of $y$ is separable into two parts; the proportion
attributable to $x$ is $\sigma_m^2/\sigma_y^2 = \eta^2$; the proportion attributable to factors
other than $x$ is $\sigma_a^2/\sigma_y^2 = 1 - \eta^2$.

Since these relations are fundamental in the problem of
analysis of variance, we shall make another (independent) approach
and arrive at the same conclusion. But we shall adopt a more
conventional notation; instead of using a subscript $m$ to denote
a mean, we shall place a bar over the letter representing the array
of which it is the mean. Thus, $\sigma_{\bar{y}}^2$ means the same as $\sigma_m^2$. We set
up the following identity, for a single score:

$$(y - \bar{y}) = (y - \bar{y}_c) + (\bar{y}_c - \bar{y})$$

the barred $y$ without a subscript referring to the mean of the
whole $y$ series while that with the subscript $c$ stands for the mean
of the column in which a particular $y$ score is found. Squaring,

$$(y - \bar{y})^2 = (y - \bar{y}_c)^2 + (\bar{y}_c - \bar{y})^2 + 2(y - \bar{y}_c)(\bar{y}_c - \bar{y})$$
Now sum for all individuals, first by columns and then across
columns for the entire table. Let $k$ denote the number of columns
and $n_c$ the number of individuals in a column, the letters below
and above the summation sign indicating the limits between
which we sum.

$$\sum^{k}\sum^{n_c}(y - \bar{y})^2 = \sum^{k}\sum^{n_c}(y - \bar{y}_c)^2 + \sum^{k}\sum^{n_c}(\bar{y}_c - \bar{y})^2 + 2\sum^{k}\sum^{n_c}(y - \bar{y}_c)(\bar{y}_c - \bar{y})$$

As long as we sum within columns, $(\bar{y}_c - \bar{y})^2$ will remain the same
within the several columns. When we sum across the columns,
we shall get

$$\sum^{k} n_c(\bar{y}_c - \bar{y})^2$$

When we sum by columns in the cross-products term, $(\bar{y}_c - \bar{y})$
will remain a constant; but for each column $\sum(y - \bar{y}_c)$ will be
zero, since the $y$’s are taken as deviations from the mean of the
column $\bar{y}_c$ as origin. Hence, in summing for the whole table, we
have

(A) $$\sum^{k}\sum^{n_c}(y - \bar{y})^2 = \sum^{k}\sum^{n_c}(y - \bar{y}_c)^2 + \sum^{k} n_c(\bar{y}_c - \bar{y})^2$$

By reason of the meaning of a sample variance, this can be
written

$$N\sigma_y^2 = \sum n_c\sigma_{a_c}^2 + \sum n_c\sigma_{\bar{y}}^2$$

If, now, we assume homoscedasticity, we shall have

$$N\sigma_y^2 = \sigma_a^2\sum n_c + \sigma_{\bar{y}}^2\sum n_c$$

But $\sum n_c = N$, since the sum of populations by columns gives
the entire population. Whence

$$N\sigma_y^2 = N\sigma_a^2 + N\sigma_{\bar{y}}^2$$

Dividing by $N$,

$$\sigma_y^2 = \sigma_a^2 + \sigma_{\bar{y}}^2; \quad \text{or} \quad \frac{\sigma_a^2}{\sigma_y^2} + \frac{\sigma_{\bar{y}}^2}{\sigma_y^2} = 1$$

Thus again we are brought to the fact that the sample variance
may be analyzed into two portions, one of which is the variance
within the columns (or classes) and the other of which is the
variance of the means of classes.
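
The identity at (A) holds for any table whatever, equal columns or not, and it is easily verified on a small example. The scores below are hypothetical and the function name is ours; the point of the sketch is only that the total sum of squares equals the within-column sum plus the weighted sum for the column means.

```python
def decompose(columns):
    """Return (total SS, within-column SS, between-means SS) for a table of columns."""
    scores = [y for col in columns for y in col]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    ss_within = 0.0
    ss_between = 0.0
    for col in columns:
        col_mean = sum(col) / len(col)
        ss_within += sum((y - col_mean) ** 2 for y in col)
        # Each column mean is weighted by the population n_c of its column.
        ss_between += len(col) * (col_mean - grand_mean) ** 2
    return ss_total, ss_within, ss_between


# Hypothetical scores in three columns of unequal size (not data from the book).
columns = [[4, 6, 5, 7], [8, 9, 7], [10, 12, 11, 9, 13]]
ss_total, ss_within, ss_between = decompose(columns)

print(round(ss_total, 4), round(ss_within, 4), round(ss_between, 4))
print(round(ss_within / ss_total + ss_between / ss_total, 10))   # the two proportions
```

Dividing each part by the total gives the two proportions, $1 - \eta^2$ and $\eta^2$, which sum to unity.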

In this development we have carried the general case where
the populations in the various classes may be different; hence
our $\sigma^2$’s are weighted for the frequency of the several classes
contributing to them. If we had chosen the special case of a
symmetrical table, where all classes have the same frequency $n$,
the derivation would have been much simpler.


ANALYSIS OF THE POPULATION VARIANCE 


So long as we confine ourselves to analysis of the sample
variance, analysis of variance is a very simple and
straightforward process; it is merely another way of expressing what can
be put as $\eta^2$ or as $r^2$. But R. A. Fisher, who introduced the
technique, chooses to project the analysis into the population
variance (see page 69) rather than keep to the sample variance.
There is available a precise test of reliability on that plane. We
shall now turn to that form.

We cannot carry the general case by the Fisher method; we
must restrict the application in two respects: (1) We must
(if we are to follow strictly the mathematical requirements) have
always a symmetrical table, each class (column) containing
the same $n$; and (2) we must limit the problem to the application
of the null hypothesis, i.e., we must assume a homogeneous
population (no correlation) and test to see whether that
hypothesis is tenable. We reenter our development above at (A). But
since the $n$ is to be the same in each column, this will take the
following simple form:

$$\sum\sum(y - \bar{y})^2 = \sum\sum(y - \bar{y}_c)^2 + n\sum(\bar{y}_c - \bar{y})^2$$

where $n$ is the number in a class (the number of rows) and $k$ is
the number of classes (of columns). We may estimate a
population variance from the first sum of squares by dividing by
$(N - 1)$, and from the second by dividing by $(N - k)$, as shown
on page 321. From $\sum(\bar{y}_c - \bar{y})^2$ we can estimate the variance
of the means of the infinite supply of random samples which
make up the population by dividing by $(k - 1)$. But remember
that, if we are dealing with random samples, of the same size
$n$, $\sigma_{\bar{y}}^2 = \tilde{\sigma}^2/n$, where $\tilde{\sigma}^2$ is the true population variance. So,
clearing of fractions, $\tilde{\sigma}^2 = n\sigma_{\bar{y}}^2$. Hence the last term, $n\sum(\bar{y}_c - \bar{y})^2$,
can be made to estimate the population variance by dividing by
$(k - 1)$. Thus we have three estimates of the population
variance as follows, derived from the sums of squares and the degrees
of freedom standing below them:

$$s_y^2 = \frac{\sum\sum(y - \bar{y})^2}{N - 1} \qquad s_a^2 = \frac{\sum\sum(y - \bar{y}_c)^2}{N - k} \qquad s_{\bar{y}}^2 = \frac{n\sum(\bar{y}_c - \bar{y})^2}{k - 1}$$

But we may no longer carry the equality sign between the first
and the sum of the two others, because they have been divided by
different values. Each, now, estimates a population variance.

The first one estimates the variance of the measures including
all factors. The second estimates a variance for a hypothetical
population in which the $x$ factor is held constant but the other
factors in $y$ are allowed to vary. The third estimates a
meaningful population variance if the assumption of no correlation is
completely fulfilled. For $\tilde{\sigma}^2 = n\sigma_{\bar{y}}^2$ only if the samples from
which $\sigma_{\bar{y}}^2$ is taken are completely random ones, of the same size,
and drawn according to the laws of chance upon a whole
homogeneous population. It is these two assumptions (homogeneous
population and samples of equal size) that limit the analysis of
variance technique to the null hypothesis and to tables with
columns all of the same $n$. In practice, adjustment is made for
unequal columns, but that is a rough adjustment without strict
mathematical warrant.


THE TEST OF SIGNIFICANCE 

Now even if the classes (the columns) in our table differed
from one another and from the whole distribution only by chance,
these three estimates of the population variance would differ
somewhat merely by reason of fluctuation in sampling. But to
the extent to which there is present some law which brings it
about that the classes differ materially in mean score, to that
extent the population variance estimated from the means of
classes will be large in comparison with that estimated from
within the classes themselves. When the difference is small,
chance fluctuation can plausibly explain it; but when the
difference becomes great, it cannot be plausibly attributed to chance.
The formula developed by Fisher for testing the divergence of
these estimates of the population variance does not, however,
involve subtracting the variances but rather dividing one by
another. This is a more sensitive and precise way of measuring
divergence than subtraction would be.
The test assumes the null hypothesis: if the classes were really
all alike (“belonged to the same homogeneous population”),
what would be the probability of getting in a sample so great a
divergence as the one we have in hand in our sample? The
answer involves a derivation that is merely an extension of the
ones for Student’s $t$ and for Pearson’s $\chi^2$. We show something
of it in Chap. XIV. If independent samples are drawn from an
infinite homogeneous parent population, these samples will differ 
in variance. The chance of obtaining a given sample can be 
stated in terms of probability, and the probability of obtaining 
simultaneously any two or more samples is the product of the 
probabilities of obtaining them separately. By stating mathe- 
matically the probability of obtaining two variances simul- 
taneously and then integrating, it is possible to determine the 
probability of obtaining two variances which diverge from each 
other by a given amount even though both samples arise from the 
same parent population. The process is inherently simple and 
straightforward, but the necessity of successively integrating ¢ 
functions involves some mathematical dodges and makes the 
arithmetic laborious. For this reason it is not feasible to
determine at each application the probability that two variances as
divergent as the ones in hand might have arisen by chance from
the same parent population. So Fisher has tabled these
probabilities for certain values. They must be tabled in terms of the
size of the two samples as well as the extent of divergence between
the estimated variances. Fisher tabled these in terms of a
function he calls $z$, which is $\frac{1}{2}\log_e\frac{s_1^2}{s_2^2}$ and also equals

$$(\log_e s_1 - \log_e s_2) = \tfrac{1}{2}(\log_e s_1^2 - \log_e s_2^2).$$

But Snedecor tabled $s_1^2/s_2^2$, which he designated $F$, because that
is the function obtained directly from the calculations and thus
saves looking up log values. Many people believe Snedecor’s
table is the most convenient in use.¹ $s_1^2$ is always to be taken as
the larger of the two variances. A fundamental condition of the


¹ We do not include in this volume tables of $F$ or of $z$, because we believe
that the research workers for whom we are writing should usually employ
the $\epsilon$ technique described in our preceding chapter. The $\epsilon$ technique tells
all that analysis of variance tells and more. We give tables for the
distribution of $\epsilon^2$. Those who wish to use the $F$ and $z$ tables can find them in
other books.
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test. of significance is that the two estimates be independent. 
This makes it necessary to compare the variance estimated from 
the means with that estimated from the classes, since the other 
pairs of variances are correlated. Other comparisons can be 
made by adjusting for the element of correlation. 


EXAMPLES OF ANALYSIS OF VARIANCE 
We shall now give a simple example of analysis of variance. 
Table XXIX displays the number of pounds of milk given in a 
month by six cows of each of five breeds as taken from the records 
at Pennsylvania State College. The problem is to determine 
whether chance fluctuation of sampling alone could explain the 
observed differences among the breeds while in reality all the 
breeds are alike in milk production (belong to the same homo- 
geneous population) or whether this null hypothesis is untenable. 


TABLE XXIX.—NUMBER OF POUNDS OF MILK GIVEN BY SIX COWS OF EACH 
OF FIVE BREEDS IN 1 MONTH AT PENNSYLVANIA STATE COLLEGE 

                                  Breeds 
Cow No.     Holstein    Jersey    Guernsey    Ayrshire    Brown Swiss 
1           1,562       914       926         1,080       1,237 
2           1,897       920       700         1,231       1,246 
3           1,559       1,147     831         1,347       1,058 
4           1,594       712       989         999         1,112 
5           1,535       702       819         1,375       1,013 
6           2,498       727       904         1,009       1,095 
Totals...   10,645      5,122     5,169       7,041       6,761 

$\sum y$ for whole table, 34,738 
$\sum y^2$ for whole table, 44,702,480 

$$n\sum(\bar{y}_c - \bar{y})^2 = \frac{k\sum(\sum y_c)^2 - (\sum y)^2}{kn} = \frac{(5)(261{,}556{,}272) - 34{,}738^2}{(5)(6)} = 3{,}368{,}424$$

$$\sum\sum(y - \bar{y}_c)^2 = \frac{n\sum y^2 - \sum(\sum y_c)^2}{n} = \frac{6{,}658{,}608}{6} = 1{,}109{,}768$$

$$\sum(y - \bar{y})^2 = \frac{N\sum y^2 - (\sum y)^2}{N} = \frac{(30)(44{,}702{,}480) - 34{,}738^2}{30} = 4{,}478{,}192$$

$$s_1^2 = \frac{n\sum(\bar{y}_c - \bar{y})^2}{k - 1} = \frac{3{,}368{,}424}{4} = 842{,}106$$

$$s_2^2 = \frac{\sum\sum(y - \bar{y}_c)^2}{N - k} = \frac{1{,}109{,}768}{25} = 44{,}390$$

$$s^2 = \frac{\sum(y - \bar{y})^2}{N - 1} = \frac{4{,}478{,}192}{29} = 154{,}420$$

(Note that $n\sum(\bar{y}_c - \bar{y})^2 + \sum\sum(y - \bar{y}_c)^2 = \sum(y - \bar{y})^2$. This serves as 
a check on the arithmetic.) 


If one has available a calculating machine, it is most convenient 
to obtain the needed sums of squares by the formulas given at 
the foot of the table, which are algebraic equivalents of the basic 
formulas. Making the calculations indicated, we get the follow- 
ing as our three estimates of the population variance: 


From means of columns, 842,106. 
From within columns, 44,390. 
From the total distribution, 154,420. 


Dividing the estimate from means by that from within columns 
to get F, we have 
$$F = \frac{s_1^2}{s_2^2} = \frac{842{,}106}{44{,}390} = 18.97$$


In order to see what the probability is of obtaining so great 
an F merely by chance fluctuation, we enter Snedecor’s table 
with the $n_1$ for means equal to (k - 1), which is 4, and the $n_2$ for 
classes equal to (N - k), which is 25. In the column for 
$n_1$ = 4 and the row for $n_2$ = 25, we find 2.76 for the 5 per cent 
value and 4.18 for the 1 per cent. Our obtained F, 18.97, is 
much beyond even the 1 per cent value. This means that, if 
there were no true difference between the breeds in milk produc- 
tion, we would obtain so great a difference in variances much 
less than 1 time in 100. The difference in breeds is, therefore, 
highly significant and the null hypothesis, that the breeds might 
not differ, is refuted. 
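
The arithmetic of this example can be retraced with a short script. What follows is only a minimal Python sketch, assuming classes of equal size and using the computing formulas at the foot of Table XXIX; the function name `one_way_anova` and the variable names are illustrative choices of ours and do not come from the text.

```python
# Milk yields (pounds per month) from Table XXIX, one list per breed.
milk = {
    "Holstein":    [1562, 1897, 1559, 1594, 1535, 2498],
    "Jersey":      [914, 920, 1147, 712, 702, 727],
    "Guernsey":    [926, 700, 831, 989, 819, 904],
    "Ayrshire":    [1080, 1231, 1347, 999, 1375, 1009],
    "Brown Swiss": [1237, 1246, 1058, 1112, 1013, 1095],
}


def one_way_anova(classes):
    """Analysis of variance for k classes of equal size n.

    Returns the between-class, within-class and total sums of squares,
    and the ratio F of the variance estimated from class means to the
    variance estimated from within classes.
    """
    k = len(classes)          # number of classes (columns)
    n = len(classes[0])       # cases per class (assumed equal)
    N = k * n                 # total number of cases
    sum_y = sum(sum(c) for c in classes)
    sum_y2 = sum(y * y for c in classes for y in c)
    sum_sq_totals = sum(sum(c) ** 2 for c in classes)

    ss_between = sum_sq_totals / n - sum_y ** 2 / N
    ss_within = sum_y2 - sum_sq_totals / n
    ss_total = sum_y2 - sum_y ** 2 / N
    f_ratio = (ss_between / (k - 1)) / (ss_within / (N - k))
    return ss_between, ss_within, ss_total, f_ratio


ss_between, ss_within, ss_total, f_ratio = one_way_anova(list(milk.values()))
print(round(ss_between), round(ss_within), round(ss_total))
print(round(f_ratio, 2))
```

The between and within sums of squares add to the total, which is the same check on the arithmetic noted under the table.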

We may show how this technique can be extended into educa- 
tional problems by the following examples. Dressel obtained the 
following data on the college grades of 810 students coming from 
15 different high schools. The problem was to ascertain whether 
different high schools differ significantly in the degree to which 
their graduates succeed in making good college grades. 


Source                               Degrees of   Sum of    Mean 
                                     freedom      squares   square 
Between means of high schools....    14           19.51     1.393 
Within high schools..............    795          492.79    0.6197 
Total............................    809          512.30    0.6333 

$$F = \frac{1.393}{0.6197} = 2.25$$

We do not find in Snedecor’s table an entry for 809 and 14 
degrees of freedom, but the F for 1,000 and 14 degrees of freedom 
is 1.70 for the 5 per cent point and 2.09 for the 1 per cent point 
while for 400 and 14 degrees of freedom the entry is 1.72 for 
5 per cent and 2.12 for 1 per cent. So our obtained 2.25 is 
beyond the one to be expected in even 1 per cent of the samples 
on the basis of chance fluctuation. So the differences of means 
of high schools cannot be reasonably attributed to chance; 
the high schools differ significantly in respect to the success 
of their graduates in college. If Snedecor’s table of F is not 
available and the worker wishes to look up the significance 
from Fisher’s z table, he must obtain 


$$z = \tfrac{1}{2}\log_e F$$


which in this application is $\tfrac{1}{2}\log_e 2.25 = 0.40546$. If a table 
of natural logarithms is not available, z can be obtained from a 
table of common logarithms as follows: 


$$z = \tfrac{1}{2}(2.302585 \log_{10} F) = 1.151294 \log_{10} F$$


which the reader will find by verification to be for this application 
also 0.40546. Fisher's z table will then give interpretations 
entirely consistent with Snedecor’s F table for the same number 
of degrees of freedom. 
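
The two routes to z can be compared numerically. The snippet below is a small sketch using only Python's standard `math` module; the variable names are ours, and the constant 2.302585 is the conversion factor quoted in the text.

```python
import math

F = 2.25  # ratio of the two variance estimates in the high-school example

# z computed directly from natural logarithms: z = (1/2) log_e F
z_natural = 0.5 * math.log(F)

# z computed from common logarithms, using log_e F = 2.302585 * log10 F
z_common = 0.5 * 2.302585 * math.log10(F)

print(z_natural, z_common)
```

Both values agree with the 0.40546 given above to the places shown there.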

So high schools are found to be significantly different in 
respect to the success of their students in college. But intrinsically 
the relation is low; the significance is high because the population 
is large. We can get a measure of the strength of the relation 
by computing ε, as explained in our previous chapter. This is 
very easily done from the above data. It involves merely 
dividing the population variance estimated from “within high 
schools” by that estimated from the total, then subtracting 
the quotient from 1.00. 


$$\epsilon^2 = 1 - \frac{0.6197}{0.6334} = .0214; \qquad \epsilon = \sqrt{.0214} = .146$$
If we have computed ε instead of F, we can make the significance 
test by referring to our Table XLVII, page 497. Here, for 
N - k = 1,000 and k - 1 = 14, the ε² at the 5 per cent point 
is .010 and that at 1 per cent is .015, while for N - k = 400 and 
k - 1 = 14 the 5 per cent value is .024 and the 1 per cent .036. 
This tells precisely the same story regarding significance that 
the F test or the z test tells: the ε could not reasonably be 
attributed to chance fluctuation of sampling. Because ε shows 
the strength of the relation as well as its significance and since 
the test of significance of ε² gives outcomes identical with those 
for F or z, we publish only the table for ε and not those for F 
and z. 

As another example we shall use some data a part of which 
is from a study by Thorndike and a part hypothetical because the 
necessary details are lacking in Thorndike’s report. A test of 
mental ability was administered to 4,540 subjects who had taken 
different combinations of courses in high school, making nine 
different curricular groups. The problem was to determine 
whether the several curricula differed in the effectiveness of their 
training as measured by this test. 


Source                                    Degrees of   Sum of        Mean 
                                          freedom      squares       square 
Between means of curricular groups....    8            38,517        4,815 
Within curricular groups..............    4,531        14,544,510    3,210 
Total.................................    4,539        14,583,027    3,213 

$$F = \frac{4{,}815}{3{,}210} = 1.50$$


For 8 and 1,000 degrees of freedom an F of 1.89 stands at the 
5 per cent level and 2.43 at the 1 per cent level, while for 8 
degrees and infinity the 5 per cent point is 1.88 and the 1 per 
cent is 2.41. So our F of 1.5 would arise by chance fluctuation 
more than 5 times in 100. There is, therefore, little promise 
in the hypothesis that the several curricula differ in training 
value. Nine curricula of these types which do not differ in 
training value in the infinite population could reasonably often 
give as large differences in a sample of our size as the ones we 
have in hand. 
The reader may wish to try the ε test on this problem as an 
exercise. 


ANALYSIS OF VARIANCE INTO MORE THAN TWO PARTS 


We may set up the following identity for one individual’s 
score: 

$$(y - \bar{y}) = (\bar{y}_c - \bar{y}) + (\bar{y}_r - \bar{y}) + (y - \bar{y}_c - \bar{y}_r + \bar{y})$$

The $\bar{y}_c$ stands for the mean of a column and the $\bar{y}_r$ for the mean of 
a row. The identity involves merely adding certain values to the 
quantity at the right and then subtracting them, so as to balance 
the equation. If, now, we square and sum for all individuals in 
the sample, we shall get the sum of the squares of the four quanti- 
ties in the several parentheses plus a series of cross products. 
But the cross products all vanish, because the total of the sums 
from them is zero. We are thus left with the following expression 
(into which we have inserted an extra parenthesis for later 
reference). 


$$\sum\sum(y - \bar{y})^2 = \sum\sum(\bar{y}_c - \bar{y})^2 + \sum\sum(\bar{y}_r - \bar{y})^2 + \sum\sum[(y - \bar{y}_r) - (\bar{y}_c - \bar{y})]^2 \tag{B}$$


The first two terms at the right of the equality sign are the 
sums of squares “between classes,” like the ones we met above; 
only, we have both the sums of squares of means of columns and 
those of means of rows. The term in brackets is a residual 
remaining in the total variance beyond the two sums of squares 
from means of columns and of rows. We have already shown 
that population variances can be estimated from $\sum\sum(\bar{y}_c - \bar{y})^2$ 
and $\sum\sum(\bar{y}_r - \bar{y})^2$ by dividing by the appropriate number of 
degrees of freedom, viz., one less than the number of columns 
and one less than the number of rows, respectively. It can 
also be shown that a further estimate of the population variance 
can be made by dividing the residual by the appropriate number 
of degrees of freedom, which Irwin¹ proves to be in this case 
(k - 1)(n - 1). Thus we have 

Between columns, $\sum\sum(\bar{y}_c - \bar{y})^2$, (k - 1) degrees of freedom. 
Between rows, $\sum\sum(\bar{y}_r - \bar{y})^2$, (n - 1) degrees of freedom. 
Residual, $\sum\sum(y - \bar{y}_c - \bar{y}_r + \bar{y})^2$, (k - 1)(n - 1) degrees of freedom. 


1 Irwin, J. O., “Mathematical Theorems Involving Analysis of Vari- 
ance,” J. Roy. Statistical Soc., Vol. 94, p. 290. 
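
The vanishing of the cross products can be checked numerically on any complete rectangular table. The sketch below uses a small hypothetical table of scores (the values are invented purely for illustration) and confirms that the total sum of squares equals the sum of the column, row and residual parts, with degrees of freedom that add in the same way. All names are ours.

```python
# Hypothetical 4-row by 3-column table of scores (illustrative values only).
table = [
    [12.0, 15.0, 11.0],
    [14.0, 18.0, 13.0],
    [10.0, 14.0, 12.0],
    [13.0, 17.0, 15.0],
]
n = len(table)       # number of rows
k = len(table[0])    # number of columns

grand_mean = sum(sum(row) for row in table) / (n * k)
row_means = [sum(row) / k for row in table]
col_means = [sum(row[j] for row in table) / n for j in range(k)]

ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)
# Each column mean occurs once in every row, each row mean once in every column.
ss_columns = n * sum((m - grand_mean) ** 2 for m in col_means)
ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
ss_residual = sum(
    (table[i][j] - col_means[j] - row_means[i] + grand_mean) ** 2
    for i in range(n)
    for j in range(k)
)

df_columns, df_rows, df_residual = k - 1, n - 1, (k - 1) * (n - 1)

print(ss_total, ss_columns + ss_rows + ss_residual)
print(df_columns + df_rows + df_residual, n * k - 1)
```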
When grouped in one way, as shown in (B), the entries in 
the brackets give, as close examination will show, the within- 
the-class sum of squares when the deviations are themselves 
taken as deviations from the means of the rows; and, when 
grouped in another way, the sum of squares within the other 
class as deviations from the means of columns. By taking the 
deviations in the columns, thus, from the means of the rows, 
the effect of the gross differences in rows is removed, and we have 
the residual variance in columns with the row factor held con- 
stant. If the population variance estimated from the residual 
is then compared with the population variance estimated from 
the means of columns, by the methods previously discussed in 
this chapter, evidence can be obtained regarding the departure 
of the column variation from chance with the row factor held 
constant. If the columns are independent of one another, as in 
our example where the Jerseys and the others were selected 
entirely at random regarding the Holsteins, the outcomes from 
this method will contain no new information; the significance 
will be the same except for random variation. If the entries 
in the columns (families) are matched in some manner, as by 
putting on the same row cows equally far along in gestation, the 
significance may be affected considerably. If, even with match- 
ing, there is no intercorrelation except zero among the columns, 
the significance will be unchanged. But, if there is positive 
intercorrelation, the residual will be decreased and the signifi- 
cance of the differences between families thereby increased. If 
there is negative intercorrelation, the matched group arrange- 
ment will yield a larger residual variance and a lower reliability 
than the random one. 

We shall illustrate this in the table below. The table was 
adapted from data given by Snedecor on the influence of certain 
fertilizers on the yield of potatoes. The table contains several 
features of which we wish to make use later and which we ignore 
here. One of these features is the prefacing of each entry by a 
letter; ignore that for the present. Nor is it necessary for 
our present purpose that the arrangement have the same number 
of rows as columns. It could, for our present purpose, be a 
five by six, or any other rectangular arrangement. Our only 
present requirement is that the items in a row belong together 
as a class, as the same side of a field, pigs of the same age, or 
teachers employed in the same city. 

TABLE XXX.—YIELD OF POTATOES UNDER DIFFERENT FERTILIZERS 

Row        3       4       5       Total 
1          A423    E317    D323    1,835 
2          E398    C337    B447    2,029 
3          D425    A389    E449    1,969 
4          C404    B347    A234    1,681 
5          B412    D432    C386    1,944 
Total...   2,062   1,872   1,839   9,458 

The layout in Table XXX represents a field divided into 
five strips (rows) and subdivided into 25 blocks by strips in the 
perpendicular direction. Let us say that the rows represent 
north-south divisions of the field into strips which may differ 
in fertility, and the columns represent the east-west orientation. 
The numerical entries represent the average number of bushels 
per acre in the blocks. Our first concern is to find whether 
the field is homogeneous in productiveness. Test first for east- 
west homogeneity, then for north-south homogeneity. This is 
done just as in our previous example. We have the following: 


Source                Sum of squares   Degrees of freedom   Mean square 
Total..............   73,657           (N - 1) = 24         3,069 
Between columns....   26,850           (k - 1) = 4          6,712 
Residual...........   46,807           (N - k) = 20         2,340 

$$F = \frac{6{,}712}{2{,}340} = 2.87. \qquad P = 5 \text{ per cent}$$


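For north-south (rows) as follows: 

Source                Sum of squares   Degrees of freedom   Mean square 
Total..............   73,657           (N - 1) = 24         3,069 
Between rows.......   15,034           (n - 1) = 4          3,758 
Residual...........   58,623           (N - n) = 20         2,931 

$$F = \frac{3{,}758}{2{,}931} = 1.28. \qquad P > 5 \text{ per cent}$$

(P is greater than 5 per cent) 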


By this test the field appears to differ somewhat in productivity 
as we go from east to west; the F for columns is exactly at the 
5 per cent level, meaning that, if in the true population they did 
not differ, we would have obtained so great a divergence between 
our estimates of variance only 5 times in 100. But in the north- 
south direction heterogeneity is not established, since the F 
stands much below even the 5 per cent level and means that we 
would get so large a discrepancy considerably more than 5 times 
in 100 merely by chance fluctuation. 

But now we shall apply the technique discussed in the para- 
graph preceding the table; we shall hold rows constant in pro- 
ductivity and test the columns for homogeneity. Then we shall 
hold columns constant and test the rows for homogeneity. 

In the above procedure the sum of squares in our residual was 
the total sum minus that for “between columns” while we were 
testing columns, and this same total minus that “between rows” 
when we were testing rows. But here the residual sum of squares 
will be, as inspection of Eq. (B) shows, the total less the sum of 
the squares between means of columns and that between means 
of rows. That is, the residual is 


73,657 - (26,850 + 15,034) = 31,773. 


This residual is the same for testing both rows and columns 
and is always most easily obtained by subtraction from the 
total. The degrees of freedom for this residual are now only 
(k — 1)(n — 1) = (4)(4) = 16, so that the mean is 1,986. So 
we have for columns 


$$F = \frac{6{,}712}{1{,}986} = 3.38. \qquad 1 < P < 5 \text{ per cent}$$

(P is greater than 1 per cent and less than 5 per cent) 

For rows, 

$$F = \frac{3{,}758}{1{,}986} = 1.89. \qquad P > 5 \text{ per cent}$$


Evidence of lack of homogeneity in the patches is increased 
by this added element of control, in respect both to rows and 
to columns. In the case of the columns it reaches a point 
which gives fairly conclusive evidence that the strips differ in 
productivity. 
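
The step from the one-way residuals to the common two-way residual is only a matter of subtraction and division. As a hedged sketch, the following Python lines start from the sums of squares quoted in the text (73,657 total, 26,850 between columns, 15,034 between rows) and reproduce the residual and the two F ratios; the variable names are ours.

```python
# Sums of squares quoted in the text for the 5 x 5 potato field.
ss_total = 73657
ss_columns = 26850
ss_rows = 15034
k = 5  # number of columns
n = 5  # number of rows

# Residual with both row and column means removed, as in Eq. (B).
ss_residual = ss_total - (ss_columns + ss_rows)
df_residual = (k - 1) * (n - 1)
ms_residual = ss_residual / df_residual

# Variance estimates from means of columns and of rows.
ms_columns = ss_columns / (k - 1)
ms_rows = ss_rows / (n - 1)

f_columns = ms_columns / ms_residual
f_rows = ms_rows / ms_residual

print(ss_residual, df_residual, round(ms_residual))
print(round(f_columns, 2), round(f_rows, 2))
```

Because the sketch carries unrounded mean squares, its ratios can differ from hand figures in the last decimal place; here they round to the values given above.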
THE LATIN SQUARE 

We now introduce a third element of control, for which 
the letters preceding the yields entered in Table XXX were 
employed. The letters stand for different fertilizer treatments. 
Each of five fertilizers, represented by the letters A, B, C, D, E, 
respectively, was used in five blocks scattered through the field. 
The particular layout of the field now becomes important to us. 
We observe that each replication of a fertilizer treatment occurs 
in a different row and a different column, one replication in 
each row and one in each column. This necessitates that the 
layout be square, a k by k table with k experiments each repli- 
cated k times. This arrangement is called the Latin square. 

We want now to test whether or not the fertilizers designated 
A, B, C, D, E, respectively, affected the yield of potatoes differ- 
ently when the productivity of the 25 blocks is equated in respect 
to both rows and columns. We want to get a residual sum of 
squares as the “experimental error” that will be freed from the 
systematic contribution of differences in fertilizers. That sug- 
gests that we equate these scores by subtracting (algebraically) 
from each fertilizer score the difference between the mean of the 
class to which it belongs and the grand mean. This is another 
way of describing a process exactly similar to the one to which we 
resorted on page 341 when we were providing for the analysis of 
variance into three parts. Hence in the residual we subtract 
from the expression on page 341 such a differential and, to 
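balance the equation, also add it. Calling the mean of a fertilizer 
class $\bar{y}_f$, we have 

$$(y - \bar{y}) = (\bar{y}_c - \bar{y}) + (\bar{y}_r - \bar{y}) + (\bar{y}_f - \bar{y}) + \{(y - \bar{y}_r) - [(\bar{y}_c - \bar{y}) + (\bar{y}_f - \bar{y})]\}$$
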
If, now, we square and sum, all cross products will sum to zero 
if the fertilizer treatments are so arranged that one replication 
occurs in each column and one in each row, as happens in the 
Latin square. In other random arrangements they may approxi- 
mately sum to zero but there is no mathematical assurance that 
they will do so. Thus we have left 
$$\sum\sum(y - \bar{y})^2 = \sum\sum(\bar{y}_c - \bar{y})^2 + \sum\sum(\bar{y}_r - \bar{y})^2 + \sum\sum(\bar{y}_f - \bar{y})^2 + \sum\sum(y - \bar{y}_r - \bar{y}_c + \bar{y} - \bar{y}_f + \bar{y})^2$$

To the two sets of sums of squares between classes we add a 
further one, that of the means of fertilizer classes. The residual 
now is reduced and is best obtained by subtraction from the 
total. If the conditions named above have been fulfilled, a 
population variance can be estimated from the new residual by 
dividing the sum of squares by the appropriate number of degrees 
of freedom. The number of degrees of freedom is k - 1 less than 
before, where k is the number of replications of the experimental 
factor.¹ In this case the degrees of freedom are 16 - 4 = 12. 
Summing by classes the scattered scores for fertilizer treatments 
we get the following: 


Fertilizer   Sum     Mean 
A            1,767   353.4 
B            1,951   390.2 
C            1,910   382.0 
D            1,940   388.0 
E            1,890   378.0 


The sums of squares between rows and between columns we 
had before. We can easily calculate the sum of squares of the 
means of the fertilizer groups from the data given just above. 
We get the new residual sum of squares by subtraction from the 
total. Bringing all these together we have the following: 


Source                   Sum of squares   Degrees of freedom   Mean square 
Total.................   73,657           24                   3,069 
Between columns.......   26,850           4                    6,712 
Between rows..........   15,034           4                    3,758 
Between fertilizers...   4,347            4                    1,087 
Residual..............   27,426           12                   2,285 


In order to test for reliability, we now divide each population 
variance estimated from between classes (i.e., from means of 
classes) by the one estimated from the residual and have the 
following: 

For columns 

$$F = \frac{6{,}712}{2{,}285} = 2.94. \qquad P > 5 \text{ per cent}$$

For rows 

$$F = \frac{3{,}758}{2{,}285} = 1.64. \qquad P > 5 \text{ per cent}$$


1 Fisher, R. A., Statistical Methods for Research Workers, 7th ed., p. 276. 

For interpreting F, the table must be entered with 4 and 12 
degrees of freedom. 

For fertilizers the mean from between classes is actually 
less than that from within the classes. This indicates that the 
fluctuation from class to class is less than that which would 
ordinarily come from chance sampling. So the fertilizers evi- 
dently have no differential effect that this experiment determines, 
unless it should exert some influence to make the average of 
the classes alike without simultaneously making the individuals 
within classes alike, which is extremely improbable. It is, 
therefore, highly improbable that there is a real effect rather than 
a chance one. But we shall look up its probability anyway in 
the F table. We must always divide the larger by the smaller 
variance, since the distribution would be of the same shape for 
negative deviations as for positive. 


$$F = \frac{2{,}285}{1{,}087} = 2.10. \qquad P > 5 \text{ per cent}$$

For 12 and 4 degrees of freedom an F of 5.91 stands at the 5 per 
cent point. So an F of 2.10 could easily come about by chance. 
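
The Latin-square figures can likewise be rebuilt from the fertilizer sums and the sums of squares already quoted. The following is a minimal Python sketch under those assumptions; it takes the five fertilizer sums as listed above (their order does not affect the result) and the totals from the text, and every name in it is ours.

```python
# Sums of yields for the five fertilizer treatments, as listed in the text.
fertilizer_sums = [1767, 1951, 1910, 1940, 1890]
k = 5  # the square is k by k, each treatment replicated k times

# Sums of squares quoted in the text.
ss_total = 73657
ss_columns = 26850
ss_rows = 15034

grand_total = sum(fertilizer_sums)
# Sum of squares between fertilizer means, by the usual computing formula.
ss_fertilizers = sum(s * s for s in fertilizer_sums) / k - grand_total ** 2 / (k * k)

# New residual by subtraction, with k - 1 fewer degrees of freedom than before.
ss_residual = ss_total - ss_columns - ss_rows - ss_fertilizers
df_residual = (k - 1) * (k - 1) - (k - 1)
ms_residual = ss_residual / df_residual
ms_fertilizers = ss_fertilizers / (k - 1)

f_columns = (ss_columns / (k - 1)) / ms_residual
f_rows = (ss_rows / (k - 1)) / ms_residual
# Larger variance over smaller, as the text directs for the fertilizers.
f_fertilizers = max(ms_fertilizers, ms_residual) / min(ms_fertilizers, ms_residual)

print(round(ss_fertilizers), round(ss_residual), df_residual)
print(round(f_columns, 2), round(f_rows, 2), round(f_fertilizers, 2))
```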

It is possible to arrange a Latin square so as to admit still a 
further factor. Then each cell will contain a score according to 
one classification designated by a Latin letter and the same score 
according to another classification represented by a Greek letter, 
so that each cell entry will be prefaced by two letters. This is 
called the Greco-Latin square. The arrangement must be such 
that each Latin letter appears once in each row and in each 
column, each Greek letter once in each row and each column, and 
each Latin letter once with each Greek letter. This permits the 
analysis of variance into four parts besides the residual (error). 
The mathematical derivation of its formula would follow along 
the same lines as the one we gave for the Latin square. The 
number of Greco-Latin squares that can be set up is narrowly 
limited; out of all the possible Latin squares only a small fraction 
can be arranged as Greco-Latin squares. The interested reader 
may pursue this further in Fisher’s Design of Experiments, 
pages 90 to 93. 

The analysis of variance into three parts (between rows, 
between columns, and residual) is not limited to the Latin square, 
though it is limited to a rectangular table in which rows consist 
of scores matched on one basis and columns consist of scores 
matched on another basis. The analysis into further factors 
(without subclasses) is limited to the Latin square or to some 
other equally effective method of randomizing the plots. It is 
clear that such an experimental design as the Latin square fits 
agricultural research especially well. A field is likely to differ in 
fertility even in closely proximate positions. The division of the 
field into blocks so placed as to sample all parts of the field in a 
systematic way is an excellent scheme for making probable 
equally favorable conditions for all the experimental factors. 
Sometimes it may be useful in other types of research (see Exer- 
cise 4, page 359). But research workers in education, psy- 
chology, and sociology, for whom chiefly we are writing, are 
likely to find fulfillment of its peculiar replication requirements 
cumbersome and impractical. We shall later have something 
to say about borrowing research techniques which were designed 
for one type of problem and trying to fit them into another. 


ANALYSIS WITH SUBCLASSES 

Further complexity is introduced into analysis of variance 
when each class is divided into subclasses and certain comparisons 
are made involving the subclasses. This is really only an 
aggregation of elemental problems which, taken singly, are 
precisely the same as the ones we discussed above. It is a 
case of “wheels within wheels.” We think it best, at least for 
novices and probably for all workers, to attack these phases one 
by one instead of driving them all abreast. Always one should 
realize that his problem consists in facing an aggregate of classes 
which may differ more or less in respect to the position of their 
means, and his question is whether these means differ more than 
the variability within the classes would justify on the basis of 
chance. Sometimes there will be a set of subclasses viewed 
within a larger class which itself is only a part of the whole; 
sometimes there will be a number of such sets of subclasses 
averaged together. But always the investigator should approach 
his problem by putting a certain question to it—a certain 
hypothesis—and manipulating his data in such manner as to 
give him the answer to that particular question. Each question 


1 A good example of such step by step analysis will be found in L. H. C. 
Tippett, The Methods of Statistics, Williams and Norgate, 2d ed., 1937, pp. 
218-226. 

will demand bringing together certain classes and comparing 
them in certain ways. If he keeps his eye clearly on the means 
he is examining for fluctuation and the classes of which they 
are the means, he will have no difficulty with such issues as 
what sums of squares are required or what are the numbers of 
degrees of freedom. 


DEGREES OF FREEDOM 


In order to remove some of the sense of magic which, for the 
layman, centers about this concept of degrees of freedom appro- 
priate for estimating a population variance, we shall show Fisher’s 
derivation for the case involving the estimate from “within 
classes.” It can be shown that the distribution of estimates 
of variance from samples is such that the probability that an 
estimate will fall in the range $d(s_p^2)$ is 

$$C_p\,\sigma^{-(n_p - 1)}\,(s_p^2)^{\frac{n_p - 3}{2}}\,e^{-\frac{n_p s_p^2}{2\sigma^2}}\,d(s_p^2)$$

where $n_p$ is the population of the array in the sample and $C_p$ is a 
quantity depending only on $n_p$. Since the columns of a correla- 
tion table (or the classes in any analysis of variance setup) 
are assumed to be independent, the probability that all the 
observed values of s will fall in assigned ranges is the product of 
all such probabilities for all the k columns or classes. The 
optimum estimate of σ is the value of σ which will make the joint 
probability a maximum. In order to find this value, we differ- 
entiate the function with respect to σ, equate the derivative to 
zero, and solve for σ (see pages 10 to 15 if necessary). We shall 
do that with the expression here made up of the products from 
the columns. But first we shall take logs, then differentiate. 
Remember that the log of a product is a sum of logs. Let P be 
the joint probability. Then 

$$\log P = \sum(\log C_p) - \left[\sum(n_p - 1)\right]\log\sigma + \sum(n_p - 3)\log s_p - \tfrac{1}{2}\left(\sum n_p s_p^2\right)\sigma^{-2} + \sum\log d(s_p^2)$$


The reason no log appears in the next to the last term is, of 
course, because it would be $\log_e e$, which equals 1. Taking 
derivatives with respect to σ (see page 22 if necessary), we have 

$$\frac{d(\log P)}{d\sigma} = -\left[\sum(n_p - 1)\right]\sigma^{-1} + \left[\sum n_p s_p^2\right]\sigma^{-3}$$

Equating this derivative to zero and solving, we get 

$$\left[\sum(n_p - 1)\right]\sigma^2 = \sum n_p s_p^2$$

so that the value of σ² which will make P a maximum, which 
we shall designate $\hat{\sigma}^2$, is 

$$\hat{\sigma}^2 = \frac{\sum n_p s_p^2}{\sum(n_p - 1)}$$

But $\sum(n_p - 1) = \sum n_p - \sum 1 = N - k$ and $s_p^2 = \sum(y - \bar{y}_p)^2/n_p$; 
whence 

$$\hat{\sigma}^2 = \frac{\sum\sum(y - \bar{y}_p)^2}{N - k}$$

where the double summation is over all the cells in all the columns. 


1 Fisher, R. A., “The Goodness of Fit of Regression Formulae,” J. Roy. 
Statistical Soc., Vol. 85, pp. 599-600 (1922). 
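
As a numerical illustration of this result, the sketch below applies the final formula to the milk data of Table XXIX. It assumes, as in the derivation, that each class variance is taken with divisor $n_p$, and it shows that the weighted sum of those variances divided by the pooled $(n_p - 1)$ is the within-classes mean square on N - k degrees of freedom. The helper name `class_variance` is ours.

```python
# Milk yields from Table XXIX, one list per breed (class).
classes = [
    [1562, 1897, 1559, 1594, 1535, 2498],  # Holstein
    [914, 920, 1147, 712, 702, 727],       # Jersey
    [926, 700, 831, 989, 819, 904],        # Guernsey
    [1080, 1231, 1347, 999, 1375, 1009],   # Ayrshire
    [1237, 1246, 1058, 1112, 1013, 1095],  # Brown Swiss
]


def class_variance(values):
    """s_p^2: mean squared deviation about the class mean, divisor n_p."""
    n_p = len(values)
    mean_p = sum(values) / n_p
    return sum((y - mean_p) ** 2 for y in values) / n_p


# Numerator: sum of n_p * s_p^2, i.e. the within-classes sum of squares.
numerator = sum(len(c) * class_variance(c) for c in classes)
# Denominator: sum of (n_p - 1), which equals N - k.
denominator = sum(len(c) - 1 for c in classes)

sigma2_hat = numerator / denominator
print(denominator, sigma2_hat)
```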

Since k is the number of classes (columns), it is clear that 
the number of degrees of freedom when estimating the population 
variance from within classes will be the total N of the sample 
utilized less the number of classes. If the classes themselves 
all have the same n’, then (N - k) = (kn’ - k) = k(n’ - 1). 
That the number of degrees of freedom for the total is (N — 1) 
was shown very simply on page 70. In the case of means 
(between classes) the divisor is (k — 1), which involves again 
the same principle as the (N - 1) for which we have just cited 
the reason. Irwin has shown that the case of analysis of variance 
into more than two parts reduces algebraically to the same basis. 
Determination of the number of degrees of freedom in regression 
and in other applications follows equally logically with the proper 
adaptation for the type of distribution involved. The reader 
has, of course, noticed that the degrees of freedom are additive, 
which fact follows from some complication of the algebraic 
expression (N - k) + (k - 1) = (N - 1). We have also shown 
that, because the cross products vanish, the sums of squares 
are additive and each sum of squares corresponds to the appro- 
priate degrees of freedom. But we believe that workers will 
perform their analyses much more safely and intelligently if, 
instead of depending upon this mechanical principle, they 
will, as said above, get their eye on the classes they are comparing 
(with the variation of the means of these classes and the varia- 
bility within the classes) and picture to themselves what 
they are doing, in the more fundamental sense discussed in this 
paragraph. 


FURTHER RAMIFICATIONS OF ANALYSIS OF VARIANCE 


The conventional treatment of analysis of variance includes, 
besides a much fuller account of analysis with subclasses, schemes 
for adjusting for classes of unequal populations, for interpolating 
scores, for correcting for covariance, for “confounding” replica- 
tions, etc. It is beyond the scope and purpose of this book 
to pursue these topics. The interested reader can find them 
treated in such books as Snedecor’s Statistical Method, Tippett’s 
The Methods of Statistics, Rider’s Introduction to Modern Statis- 
tical Methods, Fisher’s Statistical Methods for Research Workers, 
and Fisher’s Design of Experiments. 


THE SPECIAL CASE OF TWO CLASSES 


Analysis of variance may be applied, with interesting results, 
to the special case of two arrays. This, the reader will observe, 
involves the question whether the means of the two classes differ 
significantly and is, therefore, the familiar case of the significance 
of the difference between two means, assuming the null hypothesis 
which we treated on pages 177-179. This, according to the 
technique explained there, is expressed by the relation 

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{(1/N_1) + (1/N_2)}} \tag{C}$$

where s² is the population variance estimated from the two arrays 
jointly. In terms of analysis of variance we have 

(D) Between classes, $N_1(\bar{x}_1 - \bar{x})^2 + N_2(\bar{x}_2 - \bar{x})^2$ with (k - 1) 
= (2 - 1) = 1 degree of freedom 
(E) Within classes, $\sum(x_1 - \bar{x}_1)^2 + \sum(x_2 - \bar{x}_2)^2$ with ($N_1 + N_2$ 
- 2) degrees of freedom 

Now the mean, $\bar{x}$, is 

$$\bar{x} = \frac{N_1\bar{x}_1 + N_2\bar{x}_2}{N_1 + N_2}$$

Substituting this in (D) and simplifying, 

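$$N_1\left(\bar{x}_1 - \frac{N_1\bar{x}_1 + N_2\bar{x}_2}{N_1 + N_2}\right)^2 + N_2\left(\bar{x}_2 - \frac{N_1\bar{x}_1 + N_2\bar{x}_2}{N_1 + N_2}\right)^2$$

$$= N_1\left(\frac{N_1\bar{x}_1 + N_2\bar{x}_1 - N_1\bar{x}_1 - N_2\bar{x}_2}{N_1 + N_2}\right)^2 + N_2\left(\frac{N_1\bar{x}_2 + N_2\bar{x}_2 - N_1\bar{x}_1 - N_2\bar{x}_2}{N_1 + N_2}\right)^2$$

$$= N_1 N_2^2\,\frac{(\bar{x}_1 - \bar{x}_2)^2}{(N_1 + N_2)^2} + N_2 N_1^2\,\frac{(\bar{x}_2 - \bar{x}_1)^2}{(N_1 + N_2)^2}$$

$$= \frac{(N_1 + N_2)N_1 N_2(\bar{x}_1 - \bar{x}_2)^2}{(N_1 + N_2)^2} = \frac{N_1 N_2(\bar{x}_1 - \bar{x}_2)^2}{N_1 + N_2}$$

$$= \frac{(\bar{x}_1 - \bar{x}_2)^2}{\dfrac{1}{N_1} + \dfrac{1}{N_2}} \tag{F}$$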


Now Eq. (F) divided by its number of degrees of freedom 
(which is 1) gives s² as estimated from between classes, and Eq. 
(E) divided by its number of degrees of freedom (which is 
$N_1 + N_2 - 2$) gives s² as estimated from within classes. Whence 

$$F = \frac{(\bar{x}_1 - \bar{x}_2)^2}{s^2\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)} \tag{G}$$

The s² is calculated in the same manner as shown on page 
178. So comparison of (C) and (G) will show that F is precisely 
t². We can test the significance either by looking in the F or 
z tables for testing the relation of estimates of variance or by 
looking in the t table for means. We must enter the F table 
or the z table with $n_1$ = 1 and $n_2$ = N - 2. If the N’s are even 
reasonably large, we may use the normal-curve tables for t. 
Otherwise we enter the special table for Student’s distribution 
with n = ($N_1 + N_2 - 2$). 

Thus analysis of variance can be employed to test the signifi- 
cance of the difference between two means; it will give exactly 
the same result as the conventional difference of means technique 
when the null hypothesis is assumed—as, indeed, it must if both 
methods are correct and mathematics continues to be consistent. 
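
The identity between F and the square of t can also be checked
numerically. The following is a minimal sketch in Python, not part of the
original text; the function name `t_and_f` and the two short score lists
are hypothetical and serve only as an illustration of relations (C)
through (G).

```python
from math import sqrt


def t_and_f(group1, group2):
    """Return (t, F) for two independent groups under the null hypothesis."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    grand_mean = (sum(group1) + sum(group2)) / (n1 + n2)

    # (E): sum of squares within classes, with n1 + n2 - 2 degrees of freedom
    within = (sum((x - mean1) ** 2 for x in group1)
              + sum((x - mean2) ** 2 for x in group2))
    s2_within = within / (n1 + n2 - 2)

    # (C): conventional difference-of-means t
    t = (mean1 - mean2) / sqrt(s2_within * (1 / n1 + 1 / n2))

    # (D): sum of squares between classes, with 1 degree of freedom
    between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    f_ratio = (between / 1) / s2_within
    return t, f_ratio


# Hypothetical scores, for illustration only.
t, f_ratio = t_and_f([11, 15, 20, 17, 22], [25, 28, 27, 31, 24, 30])
print(t ** 2, f_ratio)  # the two printed values agree up to rounding error
```

Whatever scores are supplied, the two printed values should coincide,
which is all that the algebra above asserts.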

There has begun to be some use of this technique in educational
research.¹ But we can see no advantage whatever in it. The
analysis of variance technique gives no added information
whatever over the difference of means technique; its arithmetic
outcomes are precisely the same. On the other hand, it
suffers from the fact that the tables of z and of F are much less
complete for this case than are the tables for t. It suffers
particularly from the fact that its relations are less clear to
the layman, hence making its appeal as magic. Of course,
it is by its nature limited to confirming or refuting the null
hypothesis and cannot work into the more general case
represented by formulas (90) and (95), to say nothing of all the other
more positive techniques of classical statistics. The fact that
analysis of variance technique can be extended to cover the
case of two classes is of academic interest in showing that the
analysis of variance technique is general. But the fact that it
can be used in this application is no reason why it should be
used when more effective alternative techniques are available.


1 See Bond, Eva, “Reading and Ninth Grade Achievement,” Teach. Coll.
Contrib. Educ. No. 756, 1938.


THE RELATION OF ANALYSIS OF VARIANCE TO ε


In our preceding chapter we explained and extended a statistic
recently developed by Kelley, the unbiased correlation ratio,
which he named ε. Epsilon involves much the same calculations
as analysis of variance. The F and the z tests employed with
analysis of variance do not directly indicate the strength of the
relation that is present, but only its reliability. Analysis of
variance, that is, tells only the negative side of the story, limiting
itself to confirming or refuting the null hypothesis. Epsilon,
on the other hand, shows in language with a uniform meaning
what is the strength of the relation that is present and at the
same time permits an “exact” test of its reliability. There is a
functional relation between ε and F, as follows:¹

$$F = \frac{(N - k)\epsilon^2 + (k - 1)}{(k - 1)(1 - \epsilon^2)}$$

where k is the number of classes and N is the whole population
of the sample. Whereas ε is the same for a given strength of
relation regardless of the size of the sample or the number of
classes into which it is divided, F varies with the size of the sample
and the number of classes for a given strength of relation, as
inspection of the above formula shows. Epsilon has a meaning
as uniform as that of r, with which, in fact, it becomes identical
if the regression of classes along an ordered axis is rectilinear.


1 See pp. 421-422.

The relation of ε to analysis of variance is most obvious
when the analysis of variance is into two parts, between classes
and within classes, which is the usual form. But it obtains in
the same manner when variance is analyzed into more than
two parts, as discussed on pages 341 to 344. For this further
analysis merely adjusts the scores so as to free the variation in
the residual from additional controlling factors, whereupon the
residual scatter normally becomes less. By definition of the
squared correlation ratio it is merely 1 minus the residual
variance from the means of classes divided by the total variance;
and ε can just as properly turn on a corrected residual variance
as F can.

But in this application we cannot compute ε in terms of the
within-class variance over the total variance; we must, instead,
compute it in terms of between classes (i.e., means of classes)
and within classes (i.e., the residual). But in the case of columns
of equal n's (which must always obtain where variance is to be
analyzed into more than two parts), this is sufficiently easily
done. Straightforward algebraic manipulation of the fundamental
equations gives us for this case

$$\epsilon^2 = \frac{N\sigma_M^2 - (k - 1)s^2}{N\sigma_M^2 + (df)s^2}$$

where (df) is the number of degrees of freedom appropriate to
the residual. The $\sigma_M^2$ here is the variance of the means of classes
in the sample and is very different from the population variance
estimated from the means. But $s^2$ is the population estimate
from the squares within classes, obtained by dividing the sum of
squares by the degrees of freedom in the customary manner.
We can, however, put both of these variances in terms of population
estimates if we wish. Letting $s_M^2$ be the population variance
estimated from the means in the customary manner [i.e., by
dividing $n\sum(\bar{x}_c - \bar{x})^2$ by $k - 1$], we have

$$\epsilon^2 = \frac{(k - 1)s_M^2 - (k - 1)s^2}{(k - 1)s_M^2 + (df)s^2} \qquad (186)$$


Redoing by the epsilon technique the problem of the relation
of columns to productivity in potatoes with rows and variety
held constant (pages 345 to 348), we have

$$\epsilon^2 = \frac{4(6{,}712) - 4(2{,}285)}{4(6{,}712) + 12(2{,}285)} = .326$$

In our table an $\epsilon^2$ of .361 stands at the 5 per cent point for 4
and 12 degrees of freedom, so that the chances of obtaining an $\epsilon^2$
of .326 in this sample merely by chance are somewhat greater
than 5 in 100, which agrees with the determination by the F
technique.
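
The arithmetic of this example, and its agreement with the F technique,
can be retraced in a few lines. The sketch below is an illustration added
here, not the authors' procedure: the function names are hypothetical, k = 5
is inferred from the 4 degrees of freedom for columns, and the residual
degrees of freedom (12) are assumed to take the place of N - k in the
relation between ε and F.

```python
def epsilon_squared(s2_means, s2_resid, k, df_resid):
    """Formula (186): epsilon squared from two population variance estimates."""
    between = (k - 1) * s2_means
    return (between - (k - 1) * s2_resid) / (between + df_resid * s2_resid)


def f_from_epsilon(eps2, k, df_resid):
    """F recovered from epsilon squared (df_resid assumed to replace N - k)."""
    return (df_resid * eps2 + (k - 1)) / ((k - 1) * (1 - eps2))


# Columns in the potato problem: variance estimates 6,712 and 2,285,
# with 4 and 12 degrees of freedom.
eps2 = epsilon_squared(6712, 2285, k=5, df_resid=12)
f_ratio = f_from_epsilon(eps2, k=5, df_resid=12)

print(round(eps2, 3))        # 0.326
print(f_ratio, 6712 / 2285)  # F from epsilon and F as a direct variance ratio
```

Under these assumptions the F recovered from $\epsilon^2$ equals the ratio of
the two variance estimates, which is why the two tests cannot disagree.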

An illuminating outcome will follow by treating by the epsilon 
technique the case of varieties in that same problem. 


$$\epsilon^2 = \frac{4(1{,}087) - 4(2{,}285)}{4(6{,}712) + 12(2{,}285)} = -.088$$


The $\epsilon^2$ is negative. While no negative $\eta^2$ can result from a sample,
negative $\epsilon^2$'s can arise by chance fluctuation from a true
correlation of zero, or near zero. They arise only by chance, so that
there is no use in looking up the reliability. The distribution of
$\epsilon^2$ is not symmetrical and our table does not give negative $\epsilon^2$'s.
But the negative values parallel roughly the positive ones. The
.088 is far below the .361 which stands at the 5 per cent point for
4 and 12 degrees of freedom. Our finding by this technique
agrees, therefore, with what we obtained by the F test in analysis
of variance. The $\epsilon^2$ will be negative whenever the population
variance estimated from means of classes is less than that
estimated from the residual.

Partial epsilon, discussed on page 326, parallels analysis of 
variance with subclasses. 

Prior to Kelley's derivation of epsilon and our table of its
distribution, the correlation ratio was of rather limited service.
For the interpretation of eta was somewhat dependent upon the
size of the population and the number of classes into which it
was divided. Furthermore, there was available no “exact” test
of the significance of η. But ε is entirely free from these
limitations. It has a completely uniform and standardized meaning,
and our table for its distribution is exact.¹ In fact the ε test
of reliability gives precisely the same results as the F test of
analysis of variance for a given problem. The fact that ε has
a uniform positive meaning in addition to its ability to make an
exact test of the null hypothesis should give it a useful place in
statistics.

It is true that, traditionally, η and ε have usually been thought
of as belonging to those situations in which the classes could be
quantitatively ordered on the x axis, though this has not been
uniformly the case. But it is to be noted that there is nothing
in either the η or the ε formula that depends upon the x placement;
the correlation ratio is wholly independent of such serial ordering.
It is only if we wished to follow up the calculation of the
correlation ratio by curve fitting that we would be interested in the
serial ordering of the classes. Such added purpose is wholly
independent of the ε itself. Moreover, analysis of variance, also,
would be wholly meaningless if the classes did not belong to a
common quantitative series which could, hypothetically, be
quantitatively ordered. When, that is, one analyzes the variance
of a number of breeds of cattle with reference to milk production,
it is on the assumption that there is some x factor of which the
different breeds have different amounts by reason of which the
mean amount of milk produced differs from breed to breed.
Apart from such x factor there could be no basis for comparison
at all, any more than there could be a basis for comparing hoes
with ideals. If it is true that analysis of variance must
presuppose the hypothetical possibility of quantitatively ordering its
classes but need not stress the actual ordering and that ε likewise
permits such ordering but is not dependent upon it, there is no
fundamental difference between the ε technique and the analysis
of variance technique in the types of situations to which they
apply.

After the ε technique has shown the presence of some law and
the extent of its strength, the next step is to study the nature
of that law. This will probably take the form of trying to fit
to the data several types of curves (see Chap. XV) and testing
the goodness of their fit by $\chi^2$. Or it may take the form of
comparing means and other statistics (such as variability,
skewness, etc.) between the classes taken in pairs and of testing
the significance of the differences by such techniques as are
described in Chap. VI.


1 The term exact distribution is a technical term introduced into statistics
by Fisher to mean that, in dividing the deviation of a statistic from a
hypothetical value to get t, cognizance is taken of the fact that the divisor is not
the true population variability but an estimate of it. For small samples
this estimate is likely to be poor and the resulting distribution is more
leptokurtic than the normal one. As the sample increases in size the estimate
improves, so that s approaches σ and the exact distribution approaches the
normal. While theoretically differing anywhere short of infinity, the
differences between the two distributions become negligible when N reaches a
moderately good size.


THE PLACE OF ANALYSIS OF VARIANCE IN RESEARCH 


Analysis of variance belongs as a first step in a major research 
where one wishes to make a rough preliminary test of his hypothe- 
sis in advance of going to the expense of the elaborate setup 
needed for a thorough investigation. An agricultural research 
worker, for example, has the hypothesis that different varieties 
of wheat may, in a given locality, yield sufficiently different 
amounts of crops to justify adopting one of them rather than 
others. His first step may be to make a comparison of all of 
them simultaneously, with rather small samples, randomized 
in some effective manner as in a Latin square.¹ If he finds that,
in this trial, the varieties differ no more than chance would 
explain, he abandons his hypothesis; or at least he gives it a 
second preliminary trial. But, if his hypothesis is confirmed 
and he finds that the varieties do differ significantly on the 
average (for analysis of variance always lumps its classes into 
an average), he is ready to proceed with the positive aspect of 
his research. He will then set up his experiment, or series of
experiments, with a single variable and a large sample, will
undertake to determine which variety yields more than which and
by how much of a differential, etc. Or an investigator in Educa-
tion gets the hunch that teacher personality may influence the 
degree of introversion-extroversion of pupils. His first step may 
be to draw small samples of pupils from a half dozen teachers in a
half dozen different cities, administer to them a test of intro- 
version-extroversion, and by the analysis of variance technique 


1 Of course, when he actually sets up his preliminary study, he formally 
puts the hypothesis the other way around: he tries the null hypothesis that 
there may be no difference. But a research worker is ordinarily led into a 
problem by a positively conceived hypothesis rather than a negatively 
conceived one.

see whether there is any plausibility in his hunch. If there is, he
will then proceed to the positive type of investigation, measuring 
the amount of differences and making tests of the probable limits 
of these amounts, correlating extroversion effects with certain 
measurable characteristics of teachers, etc. In this preliminary 
exploratory stage analysis of variance, with its limitation to 
refuting or confirming the null hypothesis, its adaptation to small 
samples, and its ability to test simultaneously a number of
variations in the experimental factor, may serve the purpose 
very well. It is especially useful in agricultural research, where 
the expense of securing large samples under experimental con- 
ditions is very great. But for the positive side of the research 
the investigator will need the standard procedures of classical 
statistics, such as correlation, curve fitting, and contrast of 
correlated matched groups. Constructive research is just ready 
to begin where analysis of variance leaves off. In the field of 
educational research we are now finding some investigators who 
make a showing of their findings in terms of analysis of variance 
when the size and character of their sample would permit them 
to make the positive rather than the merely negative presenta- 
tion. Sometimes we find them, after making the positive show- 
ing, also making a showing in terms of analysis of variance. That 
is precisely parallel to the behavior of an engineer who would 
first successfully construct his bridge and thereafter conduct an 
elaborate argument to prove that it would probably be possible 
to construct such a bridge. It is always pedantic to try to make 
forced use of statistical devices borrowed from another field 
when they only poorly fit. Statistical procedures are tools to be 
drawn upon only as needed for definite and well-understood 
purposes, and those tools are best which are not only most 
natural for the worker but also most readily understood by the 
reader to whom the findings of the research are to be addressed. 
The great historical contributions to statistics did not come about 
by the intention of the author to make a statistical formula; on 
the contrary, they were inventions devised for interpreting cer- 
tain baffling research problems with which the investigator was 
confronted in some concrete setting. It is such natural
emergence of procedures from the needs of the situation, rather than
the imitative use of statistics, that should be the ideal toward 
which we work. 


WHEN A HYPOTHESIS IS REFUTED


Since analysis of variance has for its purpose the dismissal 
of hypotheses that fail to meet the test of statistical significance, 
it is fitting to say a word about when a hypothesis has been 
refuted. There is danger that research workers may interpret 
too literally and mechanically the preliminary evidence afforded 
by the technique as to when a hypothesis should be abandoned. 
If the F falls below the 5 per cent point, this means that there 
are more than 5 chances in 100 that accidents of fluctuation 
might account for one’s finding and that he must not be at all 
sure of any real differences among his classes. But, conversely, 
it means that there are, maybe, 85 or 90 chances in 100 (8 or 
10 to 1) that there are real differences which a better controlled 
study would reveal; and he might be quite unjustified in hastily 
giving up his hypothesis without further investigation. It is an 
error of one form to overreadily accept the conclusiveness of our 
findings; but it is an error of a second form to suppose that a 
hypothesis has been fully refuted when it has merely been brought 
below the level of certainty. 


Exercises 


1. H. L. Smith and M. T. Eaton give the data on the effect of drill in 
fundamental combinations upon ability to add eight digit exercises as shown 
in the table on page 360. No drill preceded test 1. Successive tests followed 
at intervals of one week with drill on the number combinations interven- 
ing. Thus the drill was cumulative throughout the period within which 
the drill was given. 

Does the analysis of variance show significant differences in means among 
the four tests: (a) when variance is analyzed into two parts? (b) When 
variance is analyzed into three parts, taking cognizance of the fact that the 
columns are positively intercorrelated? Speculate upon the meaning of the 
different outcomes you get by these two different procedures.

2. Compute and interpret ε for this table.

3. Would r be an appropriate statistic in the above problem? Would
curve fitting? What is the relation among r, ε, and curve fitting?

4. A sociologist wishes to make a preliminary test of his hypothesis that
nationalities in urban communities differ in respect to the time they allow 
to elapse before taking out their first naturalization papers. He suspects 
that the city in which they live and also the ecological zone within the city 
may be factors. With hypothetical figures (or real ones if you can get them) 
set up a Latin square to test this hypothesis. Lay off five cities as columns 
and, as rows, use zones between concentric circles centering about the down- 
town business district. How do you fit in the nationalities? 

               No. of columns added correctly
Subject     Test 1     Test 2     Test 3     Test 4
    1         11         15         20         17
    2         40         39         37         40
    3         38         38         35         40
    4         35         40         39         39
    5         22         29         37         34
    6         32         36         39         34
    7         23         24         23         26
    8         23         24         20         30
    9         10         16         18         17
   10         23         29         30         28
   11         25         28         27         31
   12         18         19         22         24
   13         22         27         34         33
   14         14         17         17         17
   15         21         30         29         29
   16         20         25         31         30
   17         34         33         36         37
   18         38         38         40         37
   19         31         39         38         40
   20         26         24         29         27
   21         23         28         26         31
   22         26         31         26         30
   23         24         25         28         26
Means        25.2       28.4       29.2       30.3


References for Further Study 


Fisher, R. A.: Statistical Methods for Research Workers, Oliver and Boyd,
7th ed., 1938. (Slightly revised editions of this book have been
published at intervals of about 2 years since 1924. It is the parent book
in this field, but difficult to read.)

Fisher, R. A.: The Design of Experiments, Oliver and Boyd, Edinburgh, 1935.
(A small book devoted to an exposition of how to set up experiments,
primarily in agricultural research.)

Irwin, J. O.: “Mathematical Theorems Involved in the Analysis of
Variance,” J. Roy. Statistical Soc., Vol. 94, pp. 284-300.

Rider, Paul R.: An Introduction to Modern Statistical Methods, John Wiley
& Sons, Inc., 1939. (Makes some attempt to give mathematical
derivations, but they are not very complete.)

Snedecor, G. W.: Statistical Methods, George Banta Publishing Company.
(The most complete popularized account of analysis of variance and of
the other phases of Fisher's statistics. Written chiefly for research
workers in agriculture. Indispensable to workers in that field and
useful to others. It makes little attempt to show the mathematical
foundations for the formulas, confining itself to an explanation of their
use for persons of little statistical training.)

Tippett, L. H. C.: The Methods of Statistics, Williams and Norgate, 2d ed.,
1937. (The clearest available explanation of analysis of variance with
some attention to its mathematical foundations. In our opinion the
best single book undertaking to popularize the Fisher statistics for
persons of a moderate amount of statistical training. The book
contains no tables.)


CHAPTER XIII 
FURTHER METHODS OF CORRELATION 


In Chap. IV we treated the Pearson product-moment correlation
technique and the Spearman ranks method, which latter is
just a special algebraic adaptation of the product-moment
formula. The Pearson product-moment formula is the best one
to use where it can be applied. But many situations arise in
which it would be desirable to employ a correlation technique in
which this formula is, for one reason or another, not applicable.
In this chapter we shall set forth several alternative procedures,
each adapted to some particular type of situation.


BISERIAL CORRELATION 

We sometimes have our data given in the form of two mutually
exclusive categories in respect to one factor and in quantitative
scores in respect to the other factor. It is not difficult to develop
a formula for the coefficient of correlation between the two
factors under these conditions. In Table XXXI we display such
measures from a study by Sones on the relation of size of family
to the tendency of children to leave school before the age of
eighteen. In column 2 is given the distribution of 200 children
who remained in school according to the size of the families to
which they belong, in column 3 is given a corresponding
distribution for 100 children who had left school, while in column 4
the totals are shown. We want the coefficient of correlation
between size of family and tendency of the children to leave
school. We shall lay off our situation graphically on the
accompanying chart. AD is the straight line passing through the
means of the two arrays. The slope of line AD is the regression
coefficient and, when multiplied by the ratio of the σ's of the two
factors, becomes the coefficient of correlation. Letting $y_2$
stand for DC, $y_1$ for AB, $\bar{x}_2$ for OC, $\bar{x}_1$ for OB, and b for the slope
of the line of the means, we have

$$b = \frac{y_2}{\bar{x}_2} = \frac{y_1}{\bar{x}_1} = \frac{y_2 + y_1}{\bar{x}_2 + \bar{x}_1}$$

This last term on the right comes from application of “alternation”
and “composition” (recall elementary algebra or geometry).
Now since r equals b multiplied by the σ ratio,

$$r = \frac{(y_2 + y_1)\,\sigma_x}{(\bar{x}_2 + \bar{x}_1)\,\sigma_y}$$


Fig. 23.


TABLE XXXI. DISTRIBUTION OF CHILDREN WHO REMAINED IN SCHOOL
AND OF CHILDREN WHO LEFT SCHOOL BEFORE EIGHTEEN YEARS OF
AGE, ACCORDING TO SIZE OF FAMILIES¹

     (1)             (2)           (3)         (4)
No. children      Remained        Left        Total
 in family        in school      school

     12               2                          2
     11               4             3            7
     10               4             2            6
      9               4             8           12
      8              20             3           23
      7              10            17           27
      6              24            12           36
      5              18            18           36
      4              30            10           40
      3              34            12           46
      2              34            10           44
      1              16             5           21
    Mean            4.57          5.31         4.82

1 Sones, Elwood, “A Study of One Hundred Boys and Girls, Sixteen to
Eighteen Years of Age, Who Have Left School and a Similar Group Remaining
in School,” master's thesis at Pennsylvania State College, 1933.

The $(y_2 + y_1)$ is the total distance between the means of the two
distributions, hence $(M_{y_2} - M_{y_1})$. The $(\bar{x}_2 + \bar{x}_1)$ is the distance
between the means of the two parts into which the distribution
of pupils in respect to persistence in school is divided. We
may reasonably assume that, in respect to disposition to remain
in school, pupils make a normal distribution, and we have already
learned that the mean of the tail of a normal distribution from
the mean of the whole distribution is z/p, where z is the height of
the ordinate of a normal distribution of unit area and unit
standard deviation at the point of truncation and p is the
proportion of the whole distribution in the tail (see page 289).
Therefore, $\bar{x}_2$ equals z/p, and $\bar{x}_1$ equals z/q, so that

$$(\bar{x}_1 + \bar{x}_2) = \left(\frac{z}{p} + \frac{z}{q}\right) = \frac{zq + zp}{pq} = \frac{(p + q)z}{pq}$$

But (p + q) is the whole area of the distribution and, since we
are dealing in terms of proportion, is 1. Substituting 1 for
(p + q), $(\bar{x}_1 + \bar{x}_2) = z/pq$.
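
The result borrowed from page 289, that the mean of a tail lies z/p from
the mean of the whole distribution, can be recovered in one line of
integration. The sketch below is added for the reader's convenience; the
symbol $t$ for the point of truncation is an assumption introduced here
and is not the notation of the text.

```latex
% Mean of the upper tail of a unit normal distribution cut at t,
% where p is the proportion of the area lying in the tail.
\[
\frac{1}{p}\int_{t}^{\infty} x\,\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}\,dx
  = \frac{1}{p}\left[-\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}\right]_{t}^{\infty}
  = \frac{1}{p}\cdot\frac{1}{\sqrt{2\pi}}\,e^{-t^{2}/2}
  = \frac{z}{p}
\]
```

The same integration carried over the lower part of the distribution,
whose area is q, gives z/q measured in the opposite direction, and the sum
of the two distances is the z/pq used above.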

$\sigma_x = 1$, since we assumed in calculating the z of the numerator
that the x factor makes a normal distribution of unit area
and of unit standard deviation. The standard deviation of the y
factor can readily be found; it is the standard deviation of the scores
constituting the sum of the two partial y distributions (here shown
in column 4) and is to be computed in the customary manner. If the
grouping is very coarse, Sheppard's correction should be made in the
computation of this sigma. We are now ready to substitute in the r
formula above the several equivalents just found, and we have

$$r_{bis} = \frac{(M_{y_2} - M_{y_1})\,pq}{\sigma_y z} \qquad \text{(Biserial coefficient of correlation)} \quad (187)$$


Fig. 24.


Let us now apply this formula to Sones' data as displayed
in Table XXXI. $M_{y_2}$ equals 5.31 and $M_{y_1}$ is 4.57. The
proportion leaving school, p, is .33 while the proportion remaining is
(1 - .33) or .67. The standard deviation of the distribution
in column 4 is 2.57 and the z shown in our table (page 482) for a
tail of .33 is 0.3635. Substituting these values in the formula,
we have

$$r_{bis} = \frac{(5.31 - 4.57)(.333)(.667)}{(2.57)(0.3635)} = .176$$
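
The same computation can be carried through directly from the
frequencies of Table XXXI. The following Python sketch is an added
illustration, not the authors' procedure: the variable names are our own,
the blank cell of the table is entered as 0, and the ordinate z is taken
from the standard library's `statistics.NormalDist` rather than from the
printed table.

```python
from math import sqrt
from statistics import NormalDist

# Table XXXI: family sizes and the two frequency distributions.
sizes = [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
remained = [2, 4, 4, 4, 20, 10, 24, 18, 30, 34, 34, 16]
left = [0, 3, 2, 8, 3, 17, 12, 18, 10, 12, 10, 5]
total = [a + b for a, b in zip(remained, left)]


def weighted_mean(values, freqs):
    return sum(v * f for v, f in zip(values, freqs)) / sum(freqs)


m_left = weighted_mean(sizes, left)          # M_y2 in the text
m_remained = weighted_mean(sizes, remained)  # M_y1 in the text
m_total = weighted_mean(sizes, total)

# Standard deviation of the whole y distribution (column 4).
n = sum(total)
sd_total = sqrt(sum(f * (v - m_total) ** 2 for v, f in zip(sizes, total)) / n)

# Proportions on either side of the dichotomy and the normal ordinate
# at the point of truncation.
p = sum(left) / n
q = 1 - p
unit_normal = NormalDist()
z = unit_normal.pdf(unit_normal.inv_cdf(q))

r_bis = (m_left - m_remained) * p * q / (sd_total * z)
print(round(m_left, 2), round(m_remained, 2), round(sd_total, 2), round(z, 4))
print(round(r_bis, 3))
```

Small differences from the hand computation are to be expected, since the
text rounds p, q, and the standard deviation before substituting.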

Soper¹ gives as an approximate value of the standard error of
biserial r, provided q is not less than .05, the following:


(Standard error of biserial r) (188) 


The probable error would, of course, be .6745 times the standard 
error. 

The assumptions involved in the biserial r formula should be 
carefully noted. No assumptions whatever are made about the 
shape of the distribution in which the quantitative scores are 
found. But the assumption is definitely involved that this 
distribution is not mutilated in such fashion as to change its 
standard deviation as compared with what it would be in a 
random sample of the total population drawn upon. Normality 
is assumed in the distribution in which the dichotomy occurs. 
It is also assumed that the whole sample distribution is present
and that the two tails fit together into a whole normal dis- 
tribution. In using the formula, there is great temptation to 
draw upon the upper and lower extreme tails and omit individuals 
from the middle of the distribution. If, for example, one is 
attempting to study by this technique the correlation between 
professional training and teacher success, it would not do to 
select 100 of the best teachers (constituting, say, the uppermost 
20 per cent of the whole teaching population) and the 100 poorest, 
for that would chop out the middle of the distribution and give 
r's much too high. A method for dealing with such widespread
dichotomies is given later in this chapter. 

The biserial r is really a very promising technique for research 
in education and in the psychological and social sciences. The 
following are illustrations of a few types of situations in which it 
could be advantageously employed: 

1. Having athletes divided into successful and unsuccessful
and having measurements in a number of traits possibly related
to athletic success, find the correlation of each of these traits with
success.

2. Having teachers similarly divided, find the correlation of a
number of factors with teacher success.

3. Having a large sample of motion pictures divided into
“good” and “poor” (or perhaps “above average” and “below
average”) from the standpoint of excellence in the technique of
dramatic art and knowing the financial returns from each of
the pictures of the sample, determine the correlation between
excellence in dramatic art and financial success.

4. Having measures of certain temperamental traits for a
sample of divorced women and for a corresponding sample of
women who are not divorced, ascertain the coefficient of
correlation between each of these temperamental traits and the
tendency to be divorced.

Evidently large numbers of such problems could be formulated
in various areas of research.


1 Soper, Biometrika, Vol. X, p. 390.


TETRACHORIC CORRELATION 


A second type of situation is where we have our data in both
variables merely in the form of the number of individuals, or the
proportion of individuals, in each of two categories. Thus we
may have a total population of teachers divided into “successful”
and “unsuccessful” (meaning above or below a certain
dividing point in respect to success) and have information that
a of the former have taken courses in pedagogy beyond 6 hr. and
b of them have not, while of the latter c have had such courses
beyond the 6 hr. and d have not. We wish to find what correlation
exists between success in teaching and the taking of courses in
pedagogy. We lay our data off in the form of a fourfold table, as
indicated in Fig. 25. The dichotomic lines are KK and HH,
while the means of the distributions lie at XX and YY,
respectively. In order to make our case general, we are not assuming
that the dichotomies are equal but are allowing the dichotomic lines
to lie at distances h and k, respectively, from the means.


Fig. 25.


As a foundation for approaching our problem, we must get an
equation for the correlation surface where both of the two
correlated arrays are assumed to be normal distributions. If z′ is
the frequency of y scores at any particular value of x, then,
according to our formula for the normal curve given on page 286,

(A)     $$z' = \frac{N}{\sigma_x\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma_x^2}}$$

These y scores constitute a column which itself may be assumed
to make a normal distribution with its mean on the regression
line (assumption of rectilinearity of regression). The frequency
of any y score in this column, when the y is measured from the
mean of the column as origin, is

(B)     $$z'' = \frac{N_c}{\sigma_{y_c}\sqrt{2\pi}}\,e^{-\frac{y_c^2}{2\sigma_{y_c}^2}}$$

where the $N_c$ is the number of individuals in the column and
the $\sigma_{y_c}$ is the standard deviation of the column. We learned
(page 113) that the standard deviation of a column in a
correlation surface is $\sqrt{1 - r^2}$ times the standard deviation of the
entire y distribution. The measurements of $y_c$ are taken, as
said above, as deviations from the regression line. This point
on the regression line for this column is $r(\sigma_y/\sigma_x)x$ distant from the
mean of the whole y distribution. Therefore, in the case of any
score, $y_c = \left(y - r\dfrac{\sigma_y}{\sigma_x}x\right)$. Making this substitution for the $y_c$
and $\sigma_y\sqrt{1 - r^2}$ for $\sigma_{y_c}$, we have

(C)     $$z'' = \frac{N_c}{\sigma_y\sqrt{1 - r^2}\,\sqrt{2\pi}}\,\exp\left[-\frac{\left(y - r\dfrac{\sigma_y}{\sigma_x}x\right)^2}{2\sigma_y^2(1 - r^2)}\right]$$

Equation (A) represents the number of individuals out of a
total population of N that are to be expected in a given column
of a normal distribution. Hence the proportion of chances a
given individual has of being in that column is the value given on
the right of the equation divided by N, which would be the same
expression with 1 as its numerator instead of N. Such fraction
represents the probability that a given item will be in that
column, and hence that it will have this particular x value.
Similarly the expression at the right of Eq. (C) with 1 instead of
$N_c$ as its numerator represents the probability that, if a given
item is in the column, it is in a given cell in that column (has a
given y value). Therefore the probability of an item being in
a given cell (having a particular xy value) is the product of
these two probabilities, and the number of items that fulfill these
two conditions is N times the fraction representing the joint
probability. So we have, as the frequency of items in any given
cell,

$$z_{xy} = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\,\exp\left[-\frac{x^2}{2\sigma_x^2} - \frac{\left(y - r\dfrac{\sigma_y}{\sigma_x}x\right)^2}{2\sigma_y^2(1 - r^2)}\right]$$

The exponent of the e will simplify, by straightforward algebraic
manipulation, and yield the following form:

$$z_{xy} = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\,\exp\left[-\frac{1}{2(1 - r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)\right] \qquad (189)$$

(Frequency in a single cell, normal correlation surface for two
variables)
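
That the product of the column probability and the within-column
probability reduces to formula (189) can be confirmed numerically. The
Python sketch below is an added illustration; the function names and the
particular values of x, y, N, the σ's, and r are arbitrary assumptions
chosen only for the check.

```python
from math import exp, pi, sqrt


def cell_frequency_product(x, y, n, sigma_x, sigma_y, r):
    """N times [probability of the column] times [probability within it]."""
    # Eq. (A) with 1 in place of N: chance of falling in the column at x.
    p_column = exp(-x ** 2 / (2 * sigma_x ** 2)) / (sigma_x * sqrt(2 * pi))
    # Eq. (C) with 1 in place of N_c: chance of the cell within that column.
    sigma_col = sigma_y * sqrt(1 - r ** 2)
    y_c = y - r * (sigma_y / sigma_x) * x
    p_cell = exp(-y_c ** 2 / (2 * sigma_col ** 2)) / (sigma_col * sqrt(2 * pi))
    return n * p_column * p_cell


def cell_frequency_joint(x, y, n, sigma_x, sigma_y, r):
    """Formula (189): the simplified normal correlation surface."""
    quad = ((x / sigma_x) ** 2
            - 2 * r * x * y / (sigma_x * sigma_y)
            + (y / sigma_y) ** 2)
    coeff = n / (2 * pi * sigma_x * sigma_y * sqrt(1 - r ** 2))
    return coeff * exp(-quad / (2 * (1 - r ** 2)))


# Arbitrary illustrative values.
args = dict(x=0.7, y=-0.4, n=300, sigma_x=1.0, sigma_y=2.57, r=0.3)
print(cell_frequency_product(**args), cell_frequency_joint(**args))
```

The two printed frequencies should agree to within rounding error for any
admissible choice of the arguments.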


Suppose, now, we have our correlation surface divided into 
four quadrants, as depicted in the chart at the opening of this 
section, a, b, c, and d representing the numbers of individuals 
in the several quadrants. Then (a + b + c + d) would obviously 
equal N. Equation (189) gives us the number of individuals 
within a cell of a given x value and simultaneously of a 
given y value. If we can sum for all the cells by quadrants, we 
shall have the frequencies constituting the entire population. 
Thus integrating from x = k to x = infinity and from y = h to 
y = infinity, we shall get the population in quadrant a as our 
integral (for double integration see page 38). Correspondingly 
integrating from x = k to x = −∞ and from y = h to y = −∞, 
we get d. Similarly we could integrate in the other quadrants 
to get b and c, while the sum of these integrals would yield the 
total population N. In this integral the only unknown term 
would be r, and we could solve the resulting equation for r. 
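
As a rough numerical illustration of summing the cells of one quadrant, the sketch below integrates the surface of formula (189) with a simple midpoint rule. It is only a sketch under stated assumptions: unit standard deviations, N = 1, dichotomic lines at the means (h = k = 0), a cutoff at 6 sigma, and function names chosen here for illustration. For this special case the sum can be compared with the closed form 1/4 + arcsin(r)/(2π), a standard result that is assumed here rather than derived.

```python
import math

def cell_density(x, y, r):
    """Formula (189) with N = 1 and unit standard deviations (assumed)."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - r * r))
    expo = -(x * x + y * y - 2.0 * r * x * y) / (2.0 * (1.0 - r * r))
    return norm * math.exp(expo)

def quadrant_proportion(r, h=0.0, k=0.0, upper=6.0, steps=300):
    """Midpoint-rule sum over the cells with x > k and y > h."""
    dx = (upper - k) / steps
    dy = (upper - h) / steps
    total = 0.0
    for i in range(steps):
        x = k + (i + 0.5) * dx
        for j in range(steps):
            y = h + (j + 0.5) * dy
            total += cell_density(x, y, r)
    return total * dx * dy

r = 0.5                                    # illustrative value, not from the text
numeric = quadrant_proportion(r)
closed_form = 0.25 + math.asin(r) / (2.0 * math.pi)
print(round(numeric, 4), round(closed_form, 4))
```

The brute-force sum is workable here only because both limits are at the means; for arbitrary h and k no such closed form is available, which is why the text turns to a series.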

But this is an extremely difficult integration to perform, and 
we shall not attempt here to follow it through. In a long article, 
Karl Pearson¹ has performed the integration and has arrived at 
the following result: 

$$
\begin{aligned}
\frac{2\pi(ad - bc)}{N^2}\,e^{\frac{1}{2}(h^2 + k^2)} ={}& r + \frac{r^2}{2}hk + \frac{r^3}{6}(h^2 - 1)(k^2 - 1) + \frac{r^4}{24}hk(h^2 - 3)(k^2 - 3) \\
&+ \frac{r^5}{120}(h^4 - 6h^2 + 3)(k^4 - 6k^2 + 3) \\
&+ \frac{r^6}{720}hk(h^4 - 10h^2 + 15)(k^4 - 10k^2 + 15) \\
&+ \frac{r^7}{5{,}040}(h^6 - 15h^4 + 45h^2 - 15)(k^6 - 15k^4 + 45k^2 - 15) \\
&+ \frac{r^8}{40{,}320}hk(h^6 - 21h^4 + 105h^2 - 105)(k^6 - 21k^4 + 105k^2 - 105) + \text{etc.}
\end{aligned}
\tag{190}
$$

(Formula for tetrachoric correlation, dichotomic lines at $k/\sigma_x$ and $h/\sigma_y$ 
from the respective means) 

¹ Pearson, Karl, “On the Correlation of Characters Not Quantitatively 
Measurable,” Trans. Roy. Soc. (London), Series A, Vol. 195, pp. 1-47, 
especially pp. 1-7. 


The k and the h here are measured in terms of $\sigma_x$ and $\sigma_y$, so 
that their values can be looked up in our table, page 481; they 
are simply the values given in the $x/\sigma$ column for q representing 
the proportion of cases on either side of the dichotomic line for 
each of the two correlated distributions. So r is the only 
unknown term in the equation. In solving the equation logarithms 
must, of course, be used to find the value of the expression 
on the left,¹ and an approximation method must be employed 
on the right, as we shall illustrate shortly. 

Such a formula is really too complicated to be of much service 
in routine statistical work. We can simplify it by placing upon 
the situation certain restrictions. One of these is to place the 
dichotomic lines at the means (or at the medians, which is the 
same thing in the normal distributions we are obliged to assume). 
Then h and k will both equal zero, the e will have zero as its 
exponent and hence become equal to 1, certain terms will entirely 
disappear, and the previously complicated formula will simplify 
to 

$$
\frac{2\pi(ad - bc)}{N^2} = r + \frac{r^3}{6} + \frac{9r^5}{120} + \frac{225r^7}{5{,}040} + \cdots
$$

¹ Or the expression on the left of the equation may be put into simpler form 
as follows: 

$$
\frac{2\pi(ad - bc)}{N^2}\,e^{\frac{1}{2}(h^2 + k^2)}
= \frac{ad - bc}{N^2\left(\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}h^2}\right)\left(\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}k^2}\right)}
= \frac{ad - bc}{N^2zz'}
$$

The z’s have here the conventional meaning, and their values may be looked 
up in our tables from the q’s of the two distributions. 

The right-hand member of the simplified equation can be found in a 
list of trigonometric series as the arcsine of r; i.e., it is equal to an 
angle of which the sine is r, the angle being measured in radians. 
It is likely to be found in such a list in the form 

$$
r + \frac{1}{2}\cdot\frac{r^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{r^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{r^7}{7} + \cdots
$$


Therefore, by reason of the meaning of arcsine, 


$$
r = \sin\left[\frac{2\pi(ad - bc)}{N^2}\right] \tag{191}
$$

(Formula for tetrachoric r when the dichotomic lines are at the means) 

The $\pi$ here is 180 deg. and N is (a + b + c + d). 

As another scheme for simplification that does not compel 
equal dichotomies, Pearson, in the same article, develops certain 
empirical formulas that give approximately correct r’s, the mean 
error in 15 trials being less than 4 per cent. The simplest of 
these is the following: 

$$
r = \sin\left(\frac{\pi}{2}\cdot\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\right) \tag{192}
$$

(One formula for tetrachoric r, dichotomic lines not necessarily 
at the means) 

By taking advantage of the fact that the sine of an angle equals 
the cosine of (90 deg. minus the angle), we can put this formula 
into a little simpler shape. Remember that, since $\pi$ = 180 deg., 
90 deg. = $\pi/2$. Making this substitution, 

$$
r = \cos\left(\frac{\pi}{2} - \frac{\pi}{2}\cdot\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\right)
$$

Combining within the parentheses, 

$$
r = \cos\left(\frac{\sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\,\pi\right) \tag{193}
$$

(Second formula for tetrachoric r, no restriction on position of 
dichotomic lines) 
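
Formulas (191), (192), and (193) are short enough to check directly. The sketch below is illustrative only: the function names are invented here, the symmetric table (40, 10, 10, 40) is hypothetical, and the counts 80, 55, 20, 70 are those of the teacher illustration worked later in this section (ad = 5,600, bc = 1,100).

```python
import math

def r_equal_dichotomies(a, b, c, d):
    """Formula (191): applies only when both dichotomic lines are at the means."""
    n = a + b + c + d
    return math.sin(2.0 * math.pi * (a * d - b * c) / n ** 2)

def r_sine_form(a, b, c, d):
    """Formula (192): Pearson's empirical approximation in sine form."""
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return math.sin(0.5 * math.pi * (root_ad - root_bc) / (root_ad + root_bc))

def r_cosine_pi(a, b, c, d):
    """Formula (193): the same approximation in cosine form."""
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return math.cos(math.pi * root_bc / (root_ad + root_bc))

sym = (40, 10, 10, 40)        # hypothetical table with a = d and b = c
teachers = (80, 55, 20, 70)   # teacher illustration: a, b, c, d

print(r_equal_dichotomies(*sym), r_cosine_pi(*sym))
print(r_sine_form(*teachers), r_cosine_pi(*teachers))
```

For a table with a = d and b = c the exact median formula and the cosine form agree, since both reduce to sin[π(a − b)/N]; for unequal dichotomies only (192) and (193) apply, and they agree with each other by construction.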

It is necessary to recognize the assumptions involved in the 
development of the formulas for tetrachoric r. These are the 
following: homoscedasticity, rectilinearity of regression, normal 
distributions in both of the distributions as wholes, normal 
distributions in the individual columns, and continuous rather 
than widespread dichotomies. 

The formula for the P.E. of tetrachoric r as given by Pearson 
in the original article here drawn upon is very complicated and 
laborious to apply, especially for the general case. We shall 
give it below, without proof, and then follow through some 
approximations Pearson proposes by way of simplifying it. 


0.6745 [ (a + d)(c + b) PAC Bout ae +b) 
Pla og | ane ae 
a- = N 
ea Co ompun a (194) 
where 
1 f aaa Bs 34 
ihe en find Bago ly 
k—rh., . h—rk 
ft HAAG ia IGT 
and 
1 1 ati ee arik) 


A 6.7 ZP 
Under certain conditions this formula greatly simplifies. If 
h and k are both zero, $\beta_1 = \beta_2 = \psi_1 = \psi_2 = 0$. Then everything 
within the brackets will disappear except the first term, and the 
$\chi_0$ will be equal to $1/(2\pi\sqrt{1 - r^2})$, so that we shall have 

$$
\text{P.E.}_r = \frac{0.6745\,(2\pi\sqrt{1 - r^2})}{\sqrt{N}}\sqrt{\frac{(a + d)(c + b)}{4N^2}} \tag{195}
$$

(Probable error of tetrachoric r when the dichotomic lines are at the 
means in both arrays) 


It can be shown by a process of algebraic and trigonometric 
transformations which we shall not reproduce here that, assuming 
a=d and b=c as would be the case when the dichotomic 
lines are at the means, 


$$
\frac{(a + d)(c + b)}{4N^2} = \left[1 - \left(\frac{\sin^{-1} r}{90^\circ}\right)^2\right]\frac{1}{16}
$$

Making that substitution in the formula above and changing 
the $2\pi$ to $\pi/2$ in order to compensate for the 4 placed as a 
multiplier with the quantity for which substitution is being made, 
we have as an equivalent formula, 

$$
\text{P.E.}_r = \frac{0.6745}{\sqrt{N}}\cdot\frac{\pi}{2}\sqrt{1 - r^2}\,\sqrt{1 - \left(\frac{\sin^{-1} r}{90^\circ}\right)^2} \tag{196}
$$

(Alternative formula for the P.E. of tetrachoric r when the dichotomic 
lines are at the means) 


If r equals zero but h and k have any values, then substitution 
in the general formula will show that 


$$
\text{P.E.}_r = \frac{0.6745}{z_hz_k\sqrt{N}}\sqrt{\frac{(a + b)(a + c)(d + b)(d + c)}{N^4}}
= \frac{0.6745}{z_hz_k\sqrt{N}}\sqrt{p_hq_hp_kq_k} \tag{197}
$$

(Probable error of tetrachoric r when the true r is zero) 

For the important task of testing the null hypothesis, formula 
(197) is the one to use. This gives the basis for an answer 
to the question whether we might have obtained the r we have 
in hand by chance fluctuation in sampling when the true r is 
zero. This is the most frequent use we have for the standard 
error of r. Finally, if both h and k are zero and r is zero, 
substitution in the general formula gives us 

$$
\text{P.E.}_r = \frac{0.6745\,\pi}{2\sqrt{N}} \tag{198}
$$

(Probable error of a tetrachoric r of zero when the dichotomic lines 
are at the means in both arrays) 


Unable to find a way of simplifying the general formula, 
Pearson resorted to the empirical scheme of combining the 
formulas for the three special cases, multiplying together formulas 
(196) and (197) and dividing by formula (198). This gave 
him an approximate formula which he found upon trial to give 
P.E.’s differing from the true ones at most by one or two units 
in the third decimal place. This formula, as the reader can 
easily verify, is as follows: 


$$
\begin{aligned}
\text{P.E.}_r &= \frac{0.6745}{z_hz_k\sqrt{N}}\sqrt{\frac{(1 - r^2)(a + b)(a + c)(d + b)(d + c)}{N^4}\left[1 - \left(\frac{\sin^{-1} r}{90^\circ}\right)^2\right]} \\
&= \frac{0.6745}{z_hz_k\sqrt{N}}\sqrt{(1 - r^2)\,p_hq_hp_kq_k\left[1 - \left(\frac{\sin^{-1} r}{90^\circ}\right)^2\right]}
\end{aligned}
\tag{199}
$$

(Approximate general formula for the probable error of tetrachoric r) 

¹ Pearson, Biometrika, Vol. 9, pp. 23, 24. 

The z’s here have the customary meaning: the height of the 
ordinates of a normal distribution of unit area and unit standard 
deviation at the points of truncation by their respective dichotomic 
lines. The values of these can be found in our tables. 
The $\sin^{-1} r$ is the arcsine of r; i.e., the angle of which r is the sine. 
Probable errors of tetrachoric r’s are of the same order of magnitude 
as those of product moment r’s of corresponding size and 
population, although the former are perhaps 40 or 50 per cent 
higher. 

As an illustration, we shall now compute the tetrachoric r 
for the hypothetical data given at the opening of this section 
about the relation of teacher success to the taking of courses in 
pedagogy. Let us say that of 135 
successful teachers 80 have had 
courses in pedagogy beyond 6 hr. and 
55 have not, while of 90 unsuccessful 
ones 20 have had such courses and 70 
have not. We shall first employ the 
complete formula, (190), in order to 
illustrate its operation and to ascer- 
tain how much the results from it 
differ in this problem from those 
obtained from the short formula. As 
a first step we drop all terms containing r beyond the second power. 
Using on the left the form given in the footnote, page 369. 


$$
\frac{5{,}600 - 1{,}100}{(225^2)(.395)(.386)} = r + \frac{(.258)(.141)}{2}\,r^2
$$

$$
.582 = r + .0178r^2
$$


Completing the square and solving this equation for r gives, as a 
first approximation, r = .577. 

Inspection of the signs of the additional terms on the right 
of the equation constituting the complete formula leads to the 
conclusion that this value for r is somewhat too high. Let us 
try .550, substituting this in all the terms on the right side of the 
equation. If the .550 is the correct value, we shall get on the 
right side of the equation .5823 to equal the same quantity on 
the left. But we get .5856, which is too much. Let us try 
for r .548. This gives us on the right .5832, which is still a 
trifle too high. Try .547. This yields on the right .5820. 
This has now passed below the value on the left but by only a 
trifling amount. The true r, correct to the third decimal place, 
is therefore .547. On pages 501 to 504 we give tables to facilitate 
these calculations. 

Fig. 26. 
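
The trial-and-error substitution just described can be mechanized. The following is a minimal sketch, not the authors' procedure: it evaluates the right side of formula (190) with Hermite polynomials generated by recurrence (each term is r^n/n! times He_{n−1}(h) He_{n−1}(k), which reproduces the printed terms) and finds r by bisection. The function names, the number of terms, and the bisection bounds are choices made here. The deviates h and k are computed from the marginal proportions rather than read from the book's tables, and both are entered as positive magnitudes, as in the worked example; the sign rules discussed later in the chapter are not handled.

```python
from math import factorial
from statistics import NormalDist

def hermite(n, x):
    """Hermite polynomial He_n(x): 1, x, x^2 - 1, x^3 - 3x, ..."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, x * cur - m * prev
    return cur

def series_190(r, h, k, terms=40):
    """Right side of formula (190), truncated after `terms` powers of r."""
    return sum(r ** n / factorial(n) * hermite(n - 1, h) * hermite(n - 1, k)
               for n in range(1, terms + 1))

def solve_r(target, h, k, lo=0.0, hi=0.99, iterations=60):
    """Bisection for r, assuming the series rises with r on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if series_190(mid, h, k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, b, c, d = 80, 55, 20, 70            # teacher illustration
n = a + b + c + d
nd = NormalDist()
h = abs(nd.inv_cdf((a + b) / n))       # deviate for the success split, taken positive
k = abs(nd.inv_cdf((a + c) / n))       # deviate for the pedagogy split, taken positive
left = (a * d - b * c) / (n ** 2 * nd.pdf(h) * nd.pdf(k))   # footnote form of the left side
r_long = solve_r(left, h, k)
print(round(left, 4), round(r_long, 3))
```

Bisection is used instead of hand substitution only because it needs no judgment about the next trial value; it assumes a positive r and a single crossing in the search interval.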

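Let us next apply to the same data the cosine $\pi$ formula. 

$$
r = \cos\left(\frac{\sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\,\pi\right)
= \cos\left(\frac{\sqrt{1{,}100}}{\sqrt{5{,}600} + \sqrt{1{,}100}}\,180^\circ\right)
= \cos 55^\circ 17' = .569
$$

The probable error of this r by formula (199) is 

$$
\text{P.E.}_r = \frac{0.6745}{(.395)(.386)\sqrt{225}}
\sqrt{\frac{(1 - .569^2)(135)(100)(125)(90)}{225^4}\left[1 - \left(\frac{34.72^\circ}{90^\circ}\right)^2\right]} = .054
$$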

The P.E. of a product moment r of the same size and popula- 
tion would be .03. 
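
The arithmetic of formula (199) for these data can be sketched as follows. This is an illustration under assumptions: the ordinates z are computed from the marginal proportions with Python's `statistics.NormalDist` instead of being read from the book's tables, the function names are invented here, and the product-moment P.E. uses the customary form 0.6745(1 − r²)/√N, which is assumed rather than taken from this section.

```python
import math
from statistics import NormalDist

def pe_tetrachoric(a, b, c, d, r):
    """Approximate probable error of tetrachoric r, formula (199)."""
    n = a + b + c + d
    nd = NormalDist()
    z_h = nd.pdf(nd.inv_cdf((a + b) / n))   # ordinate at the first dichotomic line
    z_k = nd.pdf(nd.inv_cdf((a + c) / n))   # ordinate at the second dichotomic line
    marginals = (a + b) * (a + c) * (d + b) * (d + c) / n ** 4
    angle = 1.0 - (math.degrees(math.asin(r)) / 90.0) ** 2
    return (0.6745 / (z_h * z_k * math.sqrt(n))
            * math.sqrt((1.0 - r * r) * marginals * angle))

def pe_product_moment(r, n):
    """Customary P.E. of a product-moment r (assumed form)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

pe_t = pe_tetrachoric(80, 55, 20, 70, 0.569)
pe_pm = pe_product_moment(0.569, 225)
print(round(pe_t, 3), round(pe_pm, 3))
```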

Thus, while for r we get the value .547 by the long formula 
and .569 by the short one, a difference of .022, that difference is 
less than half of the probable error. The result from the short 
formula is, therefore, quite good enough. 

The tetrachoric formulas are especially valuable in case of 
characters not measurable in definite quantitative ways but in 
which we can make broad distinctions—such as between persons 
who have left school and those who have not left, persons who 
have been retained on a job and those who have been discharged, 
persons who are above average and those who are below average 
in success, etc. Nevertheless the formulas can be applied to 
quantitatively measurable factors by making dichotomies at 
the medians or at any other fixed points, regardless of whether 
those points correspond in the two arrays. But in this case 
the worker must not be surprised to find an r computed by the 
tetrachoric formula to differ appreciably from the one he would 
get from the same sample by the product-moment method. 
The differences between r’s computed for the same sample by 
these two methods may be as great as r’s from different samples 
by either one of the methods, or nearly as much. This is partly 
because the assumptions involved in the tetrachoric method may 
not have been fulfilled, and partly because of chance arrangements 
within the quadrants which affect the product moment r 
but not the tetrachoric one. Apart from the question of 
nonfulfillment of the assumptions, it is when the population is 
relatively small that the tetrachoric r is likely to differ most 
widely from the product-moment one. 


TETRACHORIC r FROM WIDESPREAD CLASSES 


When working with the correlation methods discussed in the 
two preceding sections, it is not always convenient to deal with 
the whole distribution divided into two parts; it is often much 
more convenient to deal with widespread classes. If, for 
example, we wish to find the correlation between pedagogical 
training and efficiency in teaching, it is the most economical and 
tempting procedure to take a sampling of the best teachers and a 
sampling of the poorest ones rather than to consider all. But 
none of our conventional correlation 
formulas apply to such a situation; 
they all demand the presence of full 
distributions. For years the authors 
have experienced the almost desperate 
need in research for such a formula, 
which would give correlation 
coefficients of the same meaning and 
value as the product moment r’s and 
yet permit operation only with the 
extreme tails of one of the distri- 
butions. To meet this need we have developed the following 
formulas. 

For tetrachoric correlation from widespread classes we have 
a situation like that represented in the accompanying diagram, 
our individuals located in quadrants but an open gap between 
lines $K_1K_1$ and $K_2K_2$. We may integrate for the correlation 
surface from $k_2$ to infinity and from h to infinity to get a. This 
integral will not differ at all from what it would be if the whole 
surface were present, because the integration is from the dividing 
line outward and not across the vacant middle. When put in 
terms of proportion of the entire population by dividing by N, 
this integral will be 

$$
\frac{a}{N} = \frac{1}{2\pi}\int_{h}^{\infty}\!\int_{k_2}^{\infty} e^{-\frac{1}{2}(x^2 + y^2)}\,dx\,dy + z_{k_2}z_hS \tag{E}
$$

Fig. 27. 


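Since the e under the double integral sign has as exponent 
the sum of two terms, it may be separated into the product of 
two integrals, so that Eq. (E) may be written as follows: 

$$
\frac{a}{N} = \left[\frac{1}{\sqrt{2\pi}}\int_{k_2}^{\infty} e^{-\frac{1}{2}x^2}\,dx\right]
\left[\frac{1}{\sqrt{2\pi}}\int_{h}^{\infty} e^{-\frac{1}{2}y^2}\,dy\right] + z_{k_2}z_hS \tag{F}
$$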

Now examination will show that the first of these integrals is 
the summation of the z’s (and, consequently, of the probabilities) 
in the x distribution from $k_2$ to infinity and hence the tail of the 
x distribution from $k_2$ to infinity, which we shall call $p_{k_2}$. The 
second integral is, similarly, the tail of the y distribution, which 
we shall call $p_h$. Making these substitutions, and making some 
algebraic transformations, we have 

$$
\frac{a}{N} = p_{k_2}p_h + z_{k_2}z_hS; \qquad \frac{(a/N) - p_{k_2}p_h}{z_{k_2}z_h} = S \tag{G}
$$

The S is the slowly converging series given at the right in 
formula (190) and which we here restate in abridged form, 

$$
\frac{a - Np_{k_2}p_h}{Nz_{k_2}z_h} = S = r + r^2\,\frac{hk}{2} + r^3\,\frac{(h^2 - 1)(k^2 - 1)}{6} + r^4\,\frac{hk(h^2 - 3)(k^2 - 3)}{24} + \cdots
$$

The N of this formula we are likely not to know except by 
implication. But we do know the number of individuals in the four 
quadrants so far as the remaining tails include them, and we must 
know the proportion of the whole population included in each 
tail. We may, therefore, put N in terms of these directly known 
elements as follows: 

$$
(a + c) = p_{k_2}N; \qquad (b + d) = p_{k_1}N; \qquad \text{so} \quad (a + b + c + d) = N(p_{k_1} + p_{k_2})
$$

Therefore, 

$$
N = \frac{a + b + c + d}{p_{k_1} + p_{k_2}}
$$

Making this substitution in formula (G) and letting 
(a + b + c + d) = n instead of N, which stands for the whole 
population of the unmutilated distribution, we have 

$$
\frac{a(p_{k_1} + p_{k_2}) - np_{k_2}p_h}{nz_{k_2}z_h} = S \tag{200}
$$

Similarly integrating for d from $x = k_1$ to $x = -\infty$ and from 
$y = h$ to $y = -\infty$, we have 

$$
\frac{d(p_{k_1} + p_{k_2}) - np_{k_1}q_h}{nz_{k_1}z_h} = S_1 \tag{201}
$$

The $q_h$ at the right is used in the ordinary sense: it is (1 − p). 
Note that cognizance must be taken of the sign of h and of k, so 
that S of formula (200) does not have the same value as $S_1$ of 
formula (201). It is advisable to take two determinations of r, 
one from formula (200) and one from formula (201), and accept 
as the true value of r the geometric mean of the two. However, 
if the assumptions of the development have been fully met, the 
two values will be identical. 

Suppose, now, that the two tails of the x distribution are equal 
and that the dichotomy in the y distribution is at the mean. 
Then $k_1$ will equal $k_2$, h will be zero, $p_h$ will be ½, $z_h$ will be $1/\sqrt{2\pi}$, 
and S and $S_1$ will be identical in value. We may then 
advantageously combine formulas (200) and (201) and have 

$$
\frac{\sqrt{2\pi}\,[2p(a + d) - np]}{2nz_k} = \frac{p\sqrt{2\pi}}{2z_k}\cdot\frac{2(a + d) - n}{n} = S
$$

If our assumptions have been fully met that the tails are equal 
and the dichotomic line dividing the y distribution is at the mean, 
a would equal d and b would equal c. But, for reasons of unequal 
sampling at the two ends, if for no others, this will seldom be 
precisely the case in empirical samples. Let us, therefore, take 
$\sqrt{ad}$ for each a and d, $2\sqrt{bc}$ for (b + c), and $2\sqrt{ad}$ for (a + d). 
Then we would have 

$$
\begin{aligned}
\frac{p\sqrt{2\pi}}{2z_k}\cdot\frac{(a + d) - (b + c)}{a + d + b + c} &= S \\
\frac{p\sqrt{2\pi}}{2z_k}\cdot\frac{2\sqrt{ad} - 2\sqrt{bc}}{2\sqrt{ad} + 2\sqrt{bc}} &= S' \\
\frac{p\sqrt{2\pi}}{2z_k}\cdot\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} &= S'
\end{aligned}
\tag{I}
$$

We shall now substitute the value of S′ for the special condition 
where the dichotomy is at the mean of the y distribution, and 
hence where h = 0, and have the following formula: 

$$
\begin{aligned}
\frac{p\sqrt{2\pi}}{2z_k}\cdot\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} ={}& r - \frac{r^3}{6}(k^2 - 1) + \frac{r^5}{40}(k^4 - 6k^2 + 3) - \frac{r^7}{336}(k^6 - 15k^4 + 45k^2 - 15) \\
&+ \frac{r^9}{3{,}456}(k^8 - 28k^6 + 210k^4 - 420k^2 + 105) \\
&- \frac{r^{11}}{42{,}240}(k^{10} - 45k^8 + 630k^6 - 3{,}150k^4 + 4{,}725k^2 - 945) \\
&+ \frac{r^{13}}{599{,}040}(k^{12} - 66k^{10} + 1{,}485k^8 - 13{,}860k^6 + 51{,}975k^4 - 62{,}370k^2 + 10{,}395) \\
&- \frac{r^{15}}{9{,}676{,}800}(k^{14} - 91k^{12} + 3{,}003k^{10} - 45{,}045k^8 + 315{,}315k^6 \\
&\qquad - 945{,}945k^4 + 945{,}945k^2 - 135{,}135) \\
&+ \frac{r^{17}}{175{,}472{,}640}(k^{16} - 120k^{14} + 5{,}460k^{12} - 120{,}120k^{10} + 1{,}351{,}350k^8 \\
&\qquad - 7{,}567{,}560k^6 + 18{,}918{,}900k^4 - 16{,}216{,}200k^2 + 2{,}027{,}025) - \cdots
\end{aligned}
\tag{202}
$$

(Tetrachoric coefficient of correlation when one series is divided at the 
mean and only symmetrical tails remain from the other distribution, 
the entries being in frequencies) 


The series continues infinitely. We have given it at this 
length to indicate its behavior as terms are added, but seldom 
will the practical worker have occasion to use more than the 
first three terms on the right. In fact we shall give below a 
scheme that requires the practical worker to use only the first 
power of r and permits him then to ascertain the value for the 
whole formula from our tables. Obviously the series will converge 
rapidly for low r’s, but for high r’s it converges extremely 
slowly. In order to get reasonable convergence for r’s of .95 to 
.99, several times as many terms must be employed as we have 
given. In making our tables in the Appendix we were obliged to 
carry the formula to the hundredth power of r in order to handle 
the high r’s. 

The k is found in the table for any proportion we wish to 
retain in each of the tails, this proportion being the p of our 
formula. The k is the distance from the mean of a normal 
distribution of unit area and unit standard deviation to the 
ordinate that cuts off the required tail, and it is labeled x in the 
terminology of our tables. If a tail of 15.87 per cent (practically 
16 per cent) is employed, k will equal 1, and hence the second 
term on the right of the equation will vanish leaving the first 
power of r and powers beginning with 5. Unless r is rather high, 
these powers of r of 5 or greater may safely be neglected. 

Although we shall offer below a short-cut method, we shall 
illustrate the operation of this formula for the sake of making 
the principle clear. Out of a total student body of 1,257 members¹ 
156 of approximately the best 16 per cent in scholarship 
had accomplishment quotients above the average, while 54 had 
AQ’s below average. On the contrary, of the poorest approximately 
16 per cent in scholarship 31 had AQ’s above average and 
184 below. In this instance, contrary to what would ordinarily 
be the case, we know the precise percentage in the tails, because 
we measured the whole population for another purpose; it 
is 16.9 per cent. But extremely small errors are involved in the 
resulting r from slight discrepancies in estimating the percentage 
in the tails under conditions where that percentage cannot be 
precisely measured. The k for 16.9 per cent is 0.9581, 
which is so nearly 1 that no appreciable error would be involved 
in taking it as 1 for the sake of simplifying the arithmetic. For 
the first approximation we shall use only the first power of r. 


$$
\frac{.169\sqrt{2\pi}\,\left(\sqrt{184\cdot 156} - \sqrt{54\cdot 31}\right)}{2(0.252)\left(\sqrt{184\cdot 156} + \sqrt{54\cdot 31}\right)} = .513
$$


This is the value of the quantity on the left side of our equation, 
formula (202). The total value on the right must exactly bal- 
ance this when we take account of all the terms. In order to 
compensate for the additional terms, we would need to carry on a 
process of approximation precisely similar to that illustrated 
on page 373. But, when we try substituting .514 for r in the 
first five terms of the equation on the right, we obtain for the 
value of this member of the equation .5138, while substituting 
.513 gives us .5128; so that .513 is the closest we can get with 
three decimal places, and our first r required no correction. 
The Pearson product moment r, computed from the whole 
population, is .511, which is in remarkably close agreement with 
that given by our formula. 

¹ Peters, C. C., “A Method for Computing Accomplishment Quotients 
on the High School and College Levels,” J. Educ. Res., Vol. 14, pp. 99-111. 
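
The two steps of this illustration, computing the first-power value from the left side of formula (202) and then substituting trial r’s in the first five terms on the right, can be sketched as follows. The function names are invented here, and k and z are computed with `statistics.NormalDist` from the tail proportion .169 instead of being read from the book's tables, so the last digit may differ slightly from the printed figures.

```python
import math
from statistics import NormalDist

def hermite(n, x):
    """Hermite polynomial He_n(x): 1, x, x^2 - 1, x^3 - 3x, ..."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, x * cur - m * prev
    return cur

def first_power_r(a, b, c, d, p):
    """Left side of formula (202): the r obtained from the first power alone."""
    nd = NormalDist()
    z = nd.pdf(nd.inv_cdf(1.0 - p))          # ordinate cutting off a tail of size p
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return (p * math.sqrt(2.0 * math.pi) / (2.0 * z)
            * (root_ad - root_bc) / (root_ad + root_bc))

def series_202(r, k, terms=5):
    """First `terms` nonzero terms on the right of (202), i.e. with h = 0."""
    total = 0.0
    for j in range(terms):
        n = 2 * j + 1                        # only odd powers of r survive
        total += r ** n / math.factorial(n) * hermite(n - 1, 0.0) * hermite(n - 1, k)
    return total

p_tail = 0.169
k = NormalDist().inv_cdf(1.0 - p_tail)       # about 0.958
r_prime = first_power_r(156, 31, 54, 184, p_tail)
print(round(k, 4), round(r_prime, 3))
print(round(series_202(0.514, k), 4), round(series_202(0.513, k), 4))
```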

We have employed here a problem resulting in a rather low r, 
hence the series converged rapidly, so that we obtained a precise 
determination by using only the first few terms. But unfortunately 
the series converges very slowly for high values of r, 
so that it may be necessary to substitute in the formula through 
so many terms as to become impractical. We have, therefore, 
provided tables (pages 505 to 507) for reading values of the r’s 
for the completed formula from those obtained by solving the 
equation for only the first power of r. The use of these makes 
wholly unnecessary the method of successive approximations 
just illustrated and makes the computation of any r an extremely 
simple process. Suppose, for example, we have the fourfold 
layout of Fig. 29, with 90 and 95 in quadrants a and d and 8 and 
10 in the other two, the p for each tail being 12 per cent. 
Dropping all terms in r except the first, we have 

$$
\frac{.12\sqrt{2\pi}\,\left(\sqrt{90\cdot 95} - \sqrt{8\cdot 10}\right)}{2(.20)\left(\sqrt{90\cdot 95} + \sqrt{8\cdot 10}\right)} = .619
$$

Entering Table L with p = 12 per cent and following down 
this column, we find .6185 as the nearest value. Interpolating, 
we get as the corresponding true r, correct to 3 places, .651. 
For r’s below .25 no correction is needed; unless extremely high 
accuracy is required, the value obtained from solving for only 
the first power may be taken as the true r. But for r’s above 
.50 or .60 (depending much upon the percentage in the tail) 
the correction is important, and the more so as the r’s approach 
unity. 
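
Where the tables are not at hand, the same correction can be approximated by solving formula (202) numerically. The sketch below is not the authors' table; it carries the series to 40 odd powers of r and finds the root by bisection. The function names, the number of terms, and the search bounds are assumptions made here, and k and z are computed with `statistics.NormalDist` for a 12 per cent tail.

```python
import math
from statistics import NormalDist

def hermite(n, x):
    """Hermite polynomial He_n(x): 1, x, x^2 - 1, x^3 - 3x, ..."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, x * cur - m * prev
    return cur

def series_202(r, k, terms=40):
    """Right side of (202); only odd powers of r survive when h = 0."""
    total = 0.0
    for j in range(terms):
        n = 2 * j + 1
        total += r ** n / math.factorial(n) * hermite(n - 1, 0.0) * hermite(n - 1, k)
    return total

def true_r(r_prime, p, lo=0.0, hi=0.99, iterations=50):
    """Bisection for the r whose full series equals the first-power value r'."""
    k = NormalDist().inv_cdf(1.0 - p)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if series_202(mid, k) < r_prime:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_tail = 0.12
nd = NormalDist()
z = nd.pdf(nd.inv_cdf(1.0 - p_tail))                  # about .20
root_ad, root_bc = math.sqrt(90 * 95), math.sqrt(8 * 10)
r_prime = (p_tail * math.sqrt(2.0 * math.pi) / (2.0 * z)
           * (root_ad - root_bc) / (root_ad + root_bc))
r_corrected = true_r(r_prime, p_tail)
print(round(r_prime, 3), round(r_corrected, 3))
```

For r’s near unity more terms would be needed, in line with the remark that the tables were carried to the hundredth power of r.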

Fig. 29. 

Two Pennsylvania State College graduate students made 
empirical tests of this tetrachoric formula. C. E. Amos made 
78 determinations of the tetrachoric r’s in accordance with the 
conditions of the formula, and of the corresponding product 
moment r’s, with populations of from 128 to 802. The average 
deviation of the tetrachoric r’s from the product-moment ones 
was about 7 per cent. The most extreme divergence was five 
product-moment P.E.’s; but prevailingly the deviations stayed 
within one P.E. R. W. Jacks tried departures from the condi- 
tions of the y dichotomy at the mean and from equality of propor- 
tions in the tails and nevertheless used the formula that assumes 
these conditions in order to see how far we may depart from the 
assumptions and yet get satisfactory results. He took as p 
the geometric mean of the two p’s and as the z the geometric 
mean of the two involved in his data. Out of 87 determinations, 
he found no substantially greater error than that by Amos when 
imposing the conditions assumed in the development. We may, 
therefore, depart somewhat from the assumptions of our develop- 
ment and yet have substantially correct results. Our formula 


would then be 

$$
\frac{\sqrt{2\pi}\,\sqrt{p_1p_2}}{2\sqrt{z_1z_2}}\cdot\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} = S' \tag{203}
$$


where S’ is the value given at the right of the equality sign in 
formula (202). 

But if the dichotomy in the y distribution differs more than 
moderately from the mean and reasonably accurate results are 
required, it is best to use formulas (200) and (201) rather than 
(203). In that case the formula must be solved for two quadrants 
by the method of approximation illustrated on page 379. 
But in order to lessen the labor of such calculations, we have 
set up values for the quantities in parentheses for all tails from 
5 per cent to 40 per cent and up to the 25th power of r. These 
tables are found in the Appendix, pages 501 to 504. Suppose 
in some particular problem $p_k$ is 25 per cent and $p_h$ is 33 per cent. 
The tables need make no distinction between $p_h$ and $p_k$. We look 
in the table for the coefficient as determined by $p_k$ and at 
another place in the table for the coefficient as determined by $p_h$. 
In the case of our illustration these would run as follows: 

$$
r + (0.4770)(0.3111)r^2 + (-0.2225)(-0.3292)r^3 + (-0.3504)(-0.2520)r^4
$$


But one must carefully observe that if either the h or the k is 
negative, the sign of its part of the coefficient is reversed as 
compared with the signs in the table in all even powers of r 
but not in the odd powers. If the h and the k in the quadrant 
with which we are operating lie in the same direction from 
their respective means, the sign of the combination is plus; 
otherwise it is minus. But a little experience will show that the r 
must reach at least .16 before carrying the r’s beyond the first 
power will affect the third decimal place and beyond .20 before 
the second decimal place will be affected. When the r reaches 
.80 or .90, the equation must be solved through a considerable 
number of terms. 
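
The six coefficients quoted in the illustration for tails of 25 and 33 per cent can be reproduced to about four decimals if each tabled entry for the nth power of r is taken to be He_{n−1}(x)/√(n!), where x is the deviate cutting off the tail. That reading is an inference from the quoted numbers and from the form of the series, not a statement found in the text, and the function names below are invented for the sketch.

```python
from math import factorial, sqrt
from statistics import NormalDist

def hermite(n, x):
    """Hermite polynomial He_n(x): 1, x, x^2 - 1, x^3 - 3x, ..."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, x * cur - m * prev
    return cur

def tabled_coefficient(p, power):
    """Assumed table entry: He_{power-1}(x) / sqrt(power!) for a tail of size p."""
    x = NormalDist().inv_cdf(1.0 - p)
    return hermite(power - 1, x) / sqrt(factorial(power))

for power in (2, 3, 4):
    print(power,
          round(tabled_coefficient(0.25, power), 4),
          round(tabled_coefficient(0.33, power), 4))
```

The product of the two entries for a given power is then the coefficient of that power of r, since the two square roots multiply to n!.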


P.E. OF TETRACHORIC r FROM WIDESPREAD CLASSES 


We have not yet developed a satisfactory general formula 
for the probable error of tetrachoric r from widespread classes, 
but we submit tentatively the following one. It is for r′, the 
value when all but first powers of r are dropped from formula 
(202). But r′ may be regarded as r(1 − j), where (1 − j) is 
the value obtained by dividing the series at the right by r. 
Thus, since r is a function of r′, the P.E. of the latter is a proper 
basis for inferring the P.E. of the former. 

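Since we assume the dichotomy of the y distribution to be at 
the mean and only symmetrical tails remaining in the x distribution, 
we can put formula (202) into the following form: 

$$
r' = \frac{p\sqrt{2\pi}}{2z}\,(\alpha - \beta)
$$

where $\alpha = (a + d)/n$ and $\beta = (b + c)/n$, n being the number 
of individuals in the two tails combined. Then 

$$
\delta r' = \frac{p\sqrt{2\pi}}{2z}\,\delta(\alpha - \beta)
$$

The $p\sqrt{2\pi}/2z$ is a constant. Let m represent it. Then 

$$
\delta r' = m\,\delta(\alpha - \beta)
$$

Squaring, summing, and dividing by the number of samples, 

$$
\frac{\Sigma\delta^2r'}{s} = m^2\left(\frac{\Sigma\delta^2\alpha}{s} + \frac{\Sigma\delta^2\beta}{s} - \frac{2\Sigma\delta\alpha\,\delta\beta}{s}\right)
$$

Therefore, 

$$
\sigma_{r'}^2 = m^2(\sigma_\alpha^2 + \sigma_\beta^2 - 2r_{\alpha\beta}\sigma_\alpha\sigma_\beta)
$$

It is obvious that $r_{\alpha\beta} = -1$, for the population remaining in 
the tails is divided into only those two parts, so that as the one 
increases the other must decrease. Therefore 

$$
\sigma_{r'}^2 = m^2(\sigma_\alpha^2 + \sigma_\beta^2 + 2\sigma_\alpha\sigma_\beta)
$$

But $(\alpha + \beta) = 1$. Applying, therefore, the formula for the standard 
error of a proportion: $\sigma_\alpha = \sqrt{\alpha\beta/n}$. Similarly $\sigma_\beta = \sqrt{\alpha\beta/n}$. 
Thus, since $\sigma_\alpha$ is found to be equal to $\sigma_\beta$, we have, by substituting 
in our last $\sigma_{r'}^2$ formula above, $\sigma_{r'}^2 = 4m^2\sigma_\alpha^2$. Replacing m with 
its equivalent and taking the square root, 

$$
\sigma_{r'} = \frac{p\sqrt{2\pi}}{z}\sqrt{\frac{\alpha\beta}{n}} \tag{J}
$$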

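We may now replace $\alpha$ and $\beta$ with their values in numbers of 
individuals, and, by substituting these in (J), we have 

$$
\sigma_{r'} = \frac{p\sqrt{2\pi}}{z\sqrt{n}}\sqrt{\frac{(a + d)(b + c)}{n^2}}
$$

$$
\text{P.E.}_{r'} = \frac{0.6745\,p\sqrt{2\pi}}{z\sqrt{n}}\sqrt{\frac{(a + d)(b + c)}{n^2}} \tag{204}
$$

(Standard error and probable error of r′, tetrachoric correlation from 
widespread classes) 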


By some algebraic manipulation based on the fact that (a + 8) 
equals 1, we can put formula (204) also into the following form, 
equivalent to (204) except to the extent to which the assumption 
of symmetry in the tails and division of the y distribution at 
the mean is violated: 


0.6745 2 
P.E. = TE a 204a 
$e A aa ee (204a) 


By formula (204) the P.E. of the 7’ of .513 computed on page 379 
would be 


$$
\text{P.E.}_{r'} = \frac{.169\sqrt{2\pi}\,(0.6745)}{0.252\sqrt{425}}\sqrt{\frac{(156 + 184)(31 + 54)}{425^2}} = .022
$$

By formula (204a) this P.E. would be 


P.E.» = 


PE, = —0:8745 169 _ arg: Oba = L021 


~ 0.2524/425 V2 
The above are formulas for the P.E. of r′. But r is a function 
of r′, so that we may find the P.E.$_r$ directly from P.E.$_{r'}$. We 
may symbolize the procedure in the following formula: 

$$
\text{P.E.}_r = \tfrac{1}{2}\left[f(r' + \text{P.E.}_{r'}) - f(r' - \text{P.E.}_{r'})\right] \tag{205}
$$

In verbal directions this means that we look up in our tables in 
the Appendix the r corresponding to (r′ plus its P.E.) and again 
the r corresponding to (r′ minus its P.E.) and take for P.E.$_r$ 
half the difference between these two. For any multiple a 
of one P.E. we would need to take 1/2a times the difference 
between the r function of (r′ + a · P.E.$_{r'}$) and of (r′ − a · P.E.$_{r'}$). 

In the case of an r′ as low as the one of .513 of our illustrative 
problem there will be a negligible difference between P.E.$_r$ and 
P.E.$_{r'}$. But we shall make the translation anyway for the sake 
of illustrating the procedure. 

The r corresponding to an r′ of (.513 + .022) for a tail of .169 
is .535, while that for (.513 − .022) is .491. Therefore 

$$
\text{P.E.}_r = \tfrac{1}{2}(.535 - .491) = .022
$$
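
Formulas (204) and (205) can be chained in a few lines. In the sketch below the function f of formula (205), which the text reads from the Appendix tables, is replaced by a numerical solution of formula (202); that substitution, the function names, and the use of `statistics.NormalDist` for k and z are assumptions of the sketch, not the authors' procedure.

```python
import math
from statistics import NormalDist

def hermite(n, x):
    """Hermite polynomial He_n(x): 1, x, x^2 - 1, x^3 - 3x, ..."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, x * cur - m * prev
    return cur

def series_202(r, k, terms=40):
    """Right side of (202); only odd powers of r survive when h = 0."""
    return sum(r ** n / math.factorial(n) * hermite(n - 1, 0.0) * hermite(n - 1, k)
               for n in range(1, 2 * terms, 2))

def true_r(r_prime, p, lo=0.0, hi=0.99, iterations=50):
    """Stand-in for the table lookup f(r'): bisection on formula (202)."""
    k = NormalDist().inv_cdf(1.0 - p)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if series_202(mid, k) < r_prime:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pe_first_power(a, b, c, d, p):
    """Formula (204): probable error of r'."""
    n = a + b + c + d
    nd = NormalDist()
    z = nd.pdf(nd.inv_cdf(1.0 - p))
    return (0.6745 * p * math.sqrt(2.0 * math.pi) / (z * math.sqrt(n))
            * math.sqrt((a + d) * (b + c) / n ** 2))

def pe_true_r(r_prime, pe_prime, p):
    """Formula (205): half the spread of r over r' plus and minus one P.E."""
    return 0.5 * (true_r(r_prime + pe_prime, p) - true_r(r_prime - pe_prime, p))

pe_prime = pe_first_power(156, 31, 54, 184, 0.169)
pe_r = pe_true_r(0.513, pe_prime, 0.169)
print(round(pe_prime, 3), round(pe_r, 3))
```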


BISERIAL r FROM WIDESPREAD CLASSES 

We shall now develop a formula for biserial r from widespread 
classes to match the one just presented for tetrachoric correlation. 
There are many situations in which such a formula may be 
extremely useful in educational and sociological research; for 
example, where we wish to investigate the relation of teaching 
success to certain measured personality traits, or of attendance 
at movies to conduct outcomes, or of marital success to certain 
measurable factors, where we can have distributions of scores 
on the factors we wish to correlate with our criterion but where 
it is feasible to investigate in detail only those extremes of the 
whole population which are outstandingly “high” in the criterion 
trait and those which are outstandingly “low.” 

Referring to the diagram, let $p_1$ be the proportion in one of the 
tails and $p_2$ the proportion in the other tail. Then, using the 
same graphical relations as those employed at the opening of 
this chapter, 

$$
b = \frac{M_2 - M_1}{(z_1/p_1) + (z_2/p_2)} = \frac{(M_2 - M_1)p_1p_2}{p_2z_1 + p_1z_2}
$$

$$
r = b\,\frac{\sigma_x}{\sigma_y} = \frac{\sigma_x}{\sigma_y}\cdot\frac{(M_2 - M_1)p_1p_2}{p_2z_1 + p_1z_2}
= \frac{1}{\sigma_y}\cdot\frac{(M_2 - M_1)p_1p_2}{p_2z_1 + p_1z_2}
= \frac{(M_2 - M_1)p_1p_2}{\sigma_y(p_2z_1 + p_1z_2)} \tag{206}
$$

(Biserial coefficient of correlation from widespread classes) 

Fig. 30. 


If the two tails are equal, as will often be the case, this formula 
reduces to 

$$
r = \frac{(M_2 - M_1)p}{2z\sigma_y} \tag{207}
$$

(Biserial coefficient of correlation from widespread classes in case of 
symmetrical tails) 
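
A short sketch of formulas (206) and (207) follows. The means, tail proportions, and standard deviation used are hypothetical numbers chosen only to exercise the functions; the function names are invented here, and the ordinates z are computed with `statistics.NormalDist` rather than read from the book's tables. $M_1$ is taken as the mean of the y scores in the lower tail and $M_2$ as that in the upper tail.

```python
import math
from statistics import NormalDist

def biserial_widespread(m1, m2, p1, p2, sigma_y):
    """Formula (206): p1 is the lower-tail proportion, p2 the upper-tail one."""
    nd = NormalDist()
    z1 = nd.pdf(nd.inv_cdf(p1))          # ordinate at the cut for the lower tail
    z2 = nd.pdf(nd.inv_cdf(1.0 - p2))    # ordinate at the cut for the upper tail
    return (m2 - m1) * p1 * p2 / (sigma_y * (p2 * z1 + p1 * z2))

def biserial_symmetric(m1, m2, p, sigma_y):
    """Formula (207): both tails of size p."""
    nd = NormalDist()
    z = nd.pdf(nd.inv_cdf(1.0 - p))
    return (m2 - m1) * p / (2.0 * z * sigma_y)

# Hypothetical data: tail means 40 and 60, 16 per cent tails, sigma_y = 15.
r_general = biserial_widespread(40.0, 60.0, 0.16, 0.16, 15.0)
r_symmetric = biserial_symmetric(40.0, 60.0, 0.16, 15.0)
print(round(r_general, 3), round(r_symmetric, 3))
```

With equal tails the two functions return the same value, which is the reduction from (206) to (207) stated in the text.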


The only awkwardness involved in the use of this formula is 
connected with $\sigma_y$. This must be the standard deviation of the 
whole y distribution, unmutilated by chopping out the middle 
of the x distribution. This $\sigma$ may be calculated from a sample 
of the whole population, or it may be inferred theoretically 
from the remaining portions. The sample may be reasonably 
small compared with the whole population from which the tails 
are taken, since the standard error of a standard deviation is 
relatively low and since slight changes in the sigma do not 
greatly affect biserial r. For those situations in which it is not 
practicable to obtain this $\sigma$ from the whole population, or from a 
sample of the whole population, we developed a formula for 
inferring it from the remaining tails. The complete derivation 
is given in the previous edition of this book. The proof is somewhat 
lengthy, and for that reason we omit it here, referring the 
interested reader to pages 279 to 282 of the lithoprinted edition. 
The formula which eventuates is 


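$$\sigma_y^2 = \frac{1}{n_1 + n_2}\left\{\Sigma Y_1^2 + \Sigma Y_2^2 + (n_1 + n_2)\left[\frac{M_1 + M_2 + b\left(\frac{z_1}{p_1} - \frac{z_2}{p_2}\right)}{2}\right]^2 - (n_1M_1 + n_2M_2)\left[M_1 + M_2 + b\left(\frac{z_1}{p_1} - \frac{z_2}{p_2}\right)\right]\right\} - \frac{b^2(x_2z_2 - x_1z_1)}{p_1 + p_2} \qquad (208)$$

(The standard deviation of a total distribution taken from tails
remaining after mutilation when the moments are in score form)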


where $Y_1$ is a score in the lower tail of the distribution and $M_1$
is the mean of the scores in that tail, while z and x have the
customary meanings in a normal distribution of unit area and
unit standard deviation. Note that the x’s in formula (208) are
unbarred and refer to the distances from the mean of the whole
distribution to the points of truncation.

In solving a practical problem, we must first find the b for our
particular problem, for which the value, as indicated on page
385, is

$$b = \frac{(M_2 - M_1)p_1p_2}{p_2z_1 + p_1z_2}$$

Then, armed with this information of the value of b, we determine
the value of $\sigma_y$ by the use of formula (208) or one of its
variants. Finally we use formula (206) to find biserial r. The
r will merely be the b divided by $\sigma_y$. Remember that $x_1$ will
customarily have in its own right the minus sign, which will
be offset by the minus in the term $-x_1z_1$ of Eq. (208) and similar
expressions above. Thus, if neither tail is greater than 50 per
cent, both final xz products will be positive. Unless the worker
watches his step he may get confused on this point.

Besides normality in the total distribution of the population 
in both variables, this formula assumes sharp truncation of the 
tails. Such sharp truncation is possible if we have actual
measurements in the criterion factor. But sometimes we must 
merely estimate whether or not individuals belong in the cate- 
gories with which we are working, in which case the reliability 
of the r is lowered. This is merely because of the unreliability 
of measurement in the criterion factor and affects our problem 
in the same manner as unreliability of measurement always 
affects correlation. 

We shall now use as a practical example of this procedure 
the same study employed in our section on tetrachoric correlation 
—correlation between grade-point averages and accomplishment 
quotients for 1,257 college students. We shall display the muti- 
lated correlation table shown on page 387, all that middle section 
of students who made grade-point averages between 1.00 and 1.99 
being eliminated from consideration. 

Working with these data with intervals as units above the
mid-point of interval .30-.39 as assumed mean instead of with raw
scores, we find for the 294 students (23.4 per cent) who had made
grade-point averages of 2.00 or better a mean of 7.846 and $\Sigma Y_2^2$
of 19,365, while the 215 students (17.1 per cent) who had made
grade-point averages below 1.00 had average AQ’s of 4.340 and
$\Sigma Y_1^2$ of 4,807.

TABLE XXXII. ACCOMPLISHMENT QUOTIENTS OF “GOOD” AND “POOR”
COLLEGE STUDENTS

                      Average grade points
AQ interval      Below 1.00     2.00 and above
2.00-2.09                             2
1.90-1.99
1.80-1.89                             2
1.70-1.79                             3
1.60-1.69                             3
1.50-1.59             1               7
1.40-1.49             1               9
1.30-1.39                            23
1.20-1.29             3              34
1.10-1.19             8              67
1.00-1.09            18              70
0.90-0.99             8              50
0.80-0.89            50              21
0.70-0.79            59               3
0.60-0.69            37
0.50-0.59            21
0.40-0.49             6
0.30-0.39             3
Totals              215             294

Our first task is to find b and then $\sigma_y$ from these
data.


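$$b = \frac{(M_2 - M_1)p_1p_2}{p_2z_1 + p_1z_2} = \frac{(7.846 - 4.340)(.171)(.234)}{(.234)(0.2540) + (.171)(0.3066)} = 1.254$$

$$\sigma_y^2 = \frac{1}{509}\left\{4{,}807 + 19{,}365 + 509\left[\frac{7.846 + 4.340 + 1.254\left(\frac{0.2540}{.171} - \frac{0.3066}{.234}\right)}{2}\right]^2 - [(215)(4.340) + (294)(7.846)]\left[4.340 + 7.846 + 1.254\left(\frac{.2540}{.171} - \frac{.3066}{.234}\right)\right]\right\} - \frac{1.254^2}{.405}[(0.2540)(0.9502) + (0.3066)(0.7257)]$$

$$\sigma_y = 2.283$$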
In this case we happen to know the mean, the standard
deviation, and the r as computed from the whole population,
and we went the roundabout way of inferring these merely for
the sake of illustrating the procedure one would need to follow
if he were not fortunate enough to know the σ. The correct σ is
2.301 (in intervals) instead of the 2.283 we obtained by inference,
and the correct r is .533 when corrected for broad categories as 
compared with our .549. Inspection of the complete correlation 
chart reveals that the regression is slightly curvilinear and 
this violation of the assumptions back of the formulas threw 
us off a little. But the discrepancy is less than one standard 
error, and this is no greater than could be expected from r’s 
computed from different samples. In fact it must be expected 
that the extremes of distributions will differ in relation to their 
own distributions as wholes in somewhat the same manner as 
the regression lines of successive samples differ from one another, 
so that r’s computed from widespread classes may be expected
to diverge from corresponding ones computed by the product-moment
method to a degree comparable with the fluctuation of
r’s by either method from sample to sample. But the average
of the r’s from a number of samples may be expected to be the
same by both types of method, and in any one sample the 
chances are just as good of approaching the true r closely by the 
methods of these last few sections as by the product-moment 
method—or perhaps a little better since the P.E.’s of the 
former r’s are smaller for the same number of actually utilized 
individuals. 
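
The whole worked example can be retraced from the two columns of Table XXXII. The following is a minimal sketch, not the authors' procedure: it takes the ordinates from Python's `statistics.NormalDist` rather than the printed tables, and the expression used for $\sigma_y$ is one reading of formula (208), built from normal-theory moments of the two tails, so it should be treated as an assumption. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Table XXXII frequencies by AQ interval, lowest interval (0.30-0.39) first.
# The position 0, 1, 2, ... in the list is the score in interval units.
low = [3, 6, 21, 37, 59, 50, 8, 18, 8, 3, 0, 1, 1, 0, 0, 0, 0, 0]
high = [0, 0, 0, 0, 3, 21, 50, 70, 67, 34, 23, 9, 7, 3, 3, 2, 0, 2]
N_TOTAL = 1257  # all students, including the omitted middle group


def moments(freqs):
    """Return (n, mean, sum of squared scores) for one tail."""
    n = sum(freqs)
    total = sum(f * y for y, f in enumerate(freqs))
    squares = sum(f * y * y for y, f in enumerate(freqs))
    return n, total / n, squares


n1, m1, ss1 = moments(low)
n2, m2, ss2 = moments(high)
p1, p2 = n1 / N_TOTAL, n2 / N_TOTAL

unit = NormalDist()
x1 = unit.inv_cdf(p1)       # lower point of truncation (carries a minus sign)
x2 = unit.inv_cdf(1 - p2)   # upper point of truncation
z1, z2 = unit.pdf(x1), unit.pdf(x2)

b = (m2 - m1) * p1 * p2 / (p2 * z1 + p1 * z2)

# Inferred mean of the whole distribution, then the mean square about it.
mean_y = (m1 + m2 + b * (z1 / p1 - z2 / p2)) / 2
n = n1 + n2
mean_square = (ss1 + ss2 - 2 * mean_y * (n1 * m1 + n2 * m2) + n * mean_y ** 2) / n
sigma_y = sqrt(mean_square - b * b * (x2 * z2 - x1 * z1) / (p1 + p2))
r = b / sigma_y

print(round(m1, 3), round(m2, 3), ss1, ss2)
print(round(b, 3), round(sigma_y, 3), round(r, 3))
```

The tail means and sums of squares agree with the 4.340, 7.846, 4,807, and 19,365 quoted in the text. The inferred values come out close to the text's b = 1.254, σ = 2.283, and r = .549, with small differences in the third decimal that depend on rounding.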

Formula (208), where we must infer both the mean of the 
distribution and its standard deviation from a few fragments, 
requires considerable penciling, as the reader has probably
noticed to his horror. Of course, no one in his senses would
use that method if his data permitted him to compute a regular 
Pearson product moment r. But the labor of penciling in the 
application of this formula is trifling compared with that which 
might be involved in testing, or otherwise investigating, the great 
middle bulk of the distribution. 

The foregoing example was intended to show how to infer the 
standard deviation when it is unknown and then to compute the 
biserial r. Following is a typical example of the many potential 
uses of this formula in practical research. In it the σ of the whole
population is known. After presenting it, we shall derive
formulas for the standard error of this type of r. In order to
test the validity of the Bernreuter Personality Inventory, Krupa¹
gave to 450 freshmen at Pennsylvania State College a verbal 
description of a neurotic person and a verbal description of a 
stable person (in addition to similar treatment of other traits). 
He asked these freshmen to write the names of freshmen of their 
acquaintance who were much like the persons therein described. 
Twenty-one subjects were named as neurotic three or more times, 
and 39 as stable. All the freshmen had previously marked 
the Bernreuter Personality Inventory. Thus Krupa had the 
distribution of the neuroticism scores for the 4.66 per cent of his 
subjects who most impressed their fellow students as neurotic, 
and also the distribution of these B1N scores for the 8.67 per cent 
who impressed their fellow students as most stable; and he knew 
the standard deviation of these B1N scores for the whole 450 
students. That the neuroticism mean of the scores for the former 
group was higher than that for the latter showed some validity 
in the Bernreuter inventory. To put the extent of that validity 
into standard correlation language, Krupa applied formula (206) 
as follows: 


$$r = \frac{(M_2 - M_1)p_1p_2}{(p_1z_2 + p_2z_1)\sigma_y} = \frac{(68.19 - 31.00)(.0867)(.0466)}{[(.0466)(0.1578) + (.0867)(0.0977)]30.06} = .316$$


We need, now, a standard error for this r. The standard 
error when the true r is zero will serve a useful purpose, because 
it will enable us to test whether fluctuations of sampling could 
be expected ever to yield so large an r as the one obtained if the 
true r were zero. The derivation is easy. The formula for r 
can be expressed in two parts as follows: 


$$r = \frac{p_1p_2}{(p_1z_2 + p_2z_1)\sigma_y}\,(M_2 - M_1)$$


Except for a very slight sampling fluctuation of $\sigma_y$ which we
shall ignore, the first of these parts is a constant in a universe of 
samples from a certain population, so that the standard error is 


1 Unpublished master’s thesis at Pennsylvania State College, 1939. 
merely this constant times the standard error of the difference
between means assuming the null hypothesis. This [formula 
(101), page 178] is 


$$\sigma_{M_2 - M_1} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \hat{\sigma}\sqrt{\frac{p_1 + p_2}{p_1p_2N}}$$


The $\hat{\sigma}$ is that of either class in the tails of the distribution. But,
since the correlation is assumed to be zero and hence the slope
of the regression line is zero, this σ will be the same for both
tails and the same for all columns in the complete correlation
table, and hence the same as the $\sigma_y$ of the whole distribution.
Taking the product of the two parts and taking account of the
relation between σ and $\hat{\sigma}$, we have the following:


$$\sigma_r = \frac{p_1p_2}{(p_1z_2 + p_2z_1)\sigma_y}\,\sigma_{M_2 - M_1} = \frac{\sqrt{p_1p_2}\,\sqrt{p_1 + p_2}}{(p_1z_2 + p_2z_1)\sqrt{N - 1}} \qquad (209)$$

(The standard error of biserial r from widespread classes when
the true r is zero)
Substituting in this formula the values from Krupa’s study, the 
standard error of his r is 


$$\sigma_r = \frac{\sqrt{(.0867)(.0466)}\,\sqrt{.0867 + .0466}}{[(.0466)(0.1578) + (.0867)(0.0977)]\sqrt{450 - 1}} = .069$$


The obtained r is 4.6 times its standard error; so the probabil- 
ity that it could have arisen by chance fluctuation when the true 
r is zero is extremely slight. 
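
The arithmetic for Krupa's r and for its standard error under the null hypothesis can be retraced in a few lines. This is only a sketch using the figures quoted in the text; the variable names are illustrative.

```python
from math import sqrt

# Krupa's figures as quoted in the text.
N = 450
m1, m2 = 31.00, 68.19      # neuroticism means of the "stable" and "neurotic" groups
p1, p2 = 0.0867, 0.0466    # proportions of the 450 freshmen in those two groups
z1, z2 = 0.1578, 0.0977    # unit normal ordinates at the two points of truncation
sigma_y = 30.06            # standard deviation of the scores for all 450

cross = p1 * z2 + p2 * z1
r = (m2 - m1) * p1 * p2 / (cross * sigma_y)                       # formula (206)
se_null = sqrt(p1 * p2) * sqrt(p1 + p2) / (cross * sqrt(N - 1))   # formula (209)

print(round(r, 3), round(se_null, 3), round(r / se_null, 1))
# prints: 0.316 0.069 4.6
```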

For many purposes, including the one typified here, we need
also the general formula for the standard error of this r. Both
its derivation and its application are more complicated. It follows
along the same general lines, but we cannot assume that
the sigmas are equal as we could in the case above. The necessary
standard deviations can be put in terms of known parameters
and then put through a series of algebraic transformations, resulting
in the following:¹

$$\sigma_r = \frac{\sqrt{p_1p_2}}{(p_1z_2 + p_2z_1)\sqrt{N}}\sqrt{(p_1 + p_2) - r^2\left(\frac{p_1z_2^2}{p_2^2} + \frac{p_2z_1^2}{p_1^2} - \frac{p_1x_2z_2}{p_2} + \frac{p_2x_1z_1}{p_1}\right)} \qquad (209a)$$

(General formula for the standard error
of biserial r from widespread classes)

¹ When the two tails come together so as to make a continuous distribution,
our formula (209) reduces to Soper’s, (188), where the true r of zero is


For Krupa’s problem this works out as follows: 


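$$\sigma_r = \frac{\sqrt{(.0867)(.0466)}}{[(.0466)(0.1578) + (.0867)(0.0977)]\sqrt{450}}\sqrt{(.0466 + .0867) - .316^2\left[\frac{(.0867)(0.0977)^2}{(.0466)^2} + \frac{(.0466)(0.1578)^2}{(.0867)^2} - \frac{(.0867)(1.6781)(0.0977)}{(.0466)} + \frac{(.0466)(-1.3616)(0.1578)}{(.0867)}\right]} = .066$$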


Since at this point r’s do not distribute themselves normally 
about a true value, the type of interpretation employed on 
pages 135 to 139 is not strictly applicable; but its use does not 
distort the meaning too much for practical purposes. Employ- 
ing that type of interpretation here, the odds are about two to 
one that the true validity correlation coefficient in relation to 
this kind of criterion lies somewhere between .316 minus .066 
and .316 plus .066; i.e., between .250 and .382. 
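
The general standard error can be checked numerically in the same way. The form of (209a) coded below is a reading of the general formula that has been checked only against the figures quoted in the text, so it is offered as a sketch under that assumption. The value of $x_1$ is taken here as -1.3616, the unit normal abscissa for a lower tail of .0867, carrying its own minus sign.

```python
from math import sqrt

N = 450
r = 0.316
p1, x1, z1 = 0.0867, -1.3616, 0.1578   # lower ("stable") tail; x1 keeps its minus sign
p2, x2, z2 = 0.0466, 1.6781, 0.0977    # upper ("neurotic") tail

# Terms multiplying r squared inside the radical of formula (209a).
bracket = (p1 * z2 ** 2 / p2 ** 2
           + p2 * z1 ** 2 / p1 ** 2
           - p1 * x2 * z2 / p2
           + p2 * x1 * z1 / p1)

se = (sqrt(p1 * p2) / ((p1 * z2 + p2 * z1) * sqrt(N))
      * sqrt((p1 + p2) - r ** 2 * bracket))

se_3 = round(se, 3)
print(se_3, round(r - se_3, 3), round(r + se_3, 3))
```

The result rounds to the .066 used in the text, which gives the limits .250 and .382.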

It is worth noting that the standard error of a product moment
r with the same actually employed population (i.e., 60) would be
.116, which is nearly twice as high as the .066 of our biserial
r from widespread classes; and that of a biserial r with a continuous
population of size 60 divided at the middle would be
.149. This suggests the economy of working with widespread
classes where feasible.


¹ (continued) substituted in his. For a continuous distribution, formula (209a) gives
results very close to Soper’s but does not reduce algebraically to his. That
is because both Soper’s formula and ours make certain assumptions and
approximations, though not the same ones. Because of the space required
for the printing of the complete derivation of our formula, we cannot publish
it here. The derivation is given in an article by the senior author (Peters)
in Psychometrika for August, 1941.


MEAN-SQUARE CONTINGENCY CORRELATION
Another form of correlation which has been used to some
extent is one based on χ². On pages 414 to 418 we show how
to compute χ² from a contingency table. Our interest there is in

determining whether chance fluctuations alone could explain the 
relations which appear to exist in the table; i.e., to test the null
hypothesis. But a positive measure of correlation can be 
based on these same calculations, getting a measure which 
is designated by C and called the mean-square contingency 
correlation: 


$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}} \qquad (210)$$

(Mean-square contingency coefficient)


This measure of correlation is particularly fitting for materials 
which do not lend themselves to arrangement in categories 
that can be certainly said to be quantitatively ordered, or where 
the distances between the intervals are not susceptible of definite 
quantification. C varies between 0 and 1. But it does not, 
in itself, indicate the sign or the character of the regression. 
This must be determined by inspection. Karl Pearson showed 
that, if the items are capable of interpretation as a quantitatively 
ordered series, if the distributions are normal, and if the regres- 
sion is rectilinear, C becomes identical with r as the number of 
categories is indefinitely increased. But, since these assump- 
tions are usually so far from fulfillment and since C is fairly 
laborious to compute compared with other forms of correlation, 
we do not regard it as a particularly useful form. Tetrachoric r, 
or the form we shall discuss later in this chapter, will ordinarily 
serve the same purpose better. 

On pages 415 and 416 some data by Burgess and Cottrell on
marriage adjustment in relation to level of education are used to
illustrate application of the χ² technique. We shall here draw
upon the explanation and the calculation of χ² which is given
there. For this problem χ² = 36.9, and N is 513. Hence

$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}} = \sqrt{\frac{36.9}{36.9 + 513}} = .25$$


The formula which Pearson gives for the standard error of C 
is somewhat laborious to apply, and we are not explaining it here. 
The interested reader will find it in Kelley’s Statistical Method, 
page 369. 

The maximum size of C is limited by the number of categories
into which the distributions are divided. It can easily be shown¹
that the maximum value C can have when computed from a
square contingency table with t rows and t columns is

$$\sqrt{\frac{t - 1}{t}}$$

¹ See the lithoprinted edition of this book, p. 288.


Thus, even though we know the correlation is perfect, our formula 
cannot give us a higher correlation than $\sqrt{(t - 1)/t}$, which is
.866 in the case of a 4-category table. Even in a table with 
15 categories in each array the maximum correlation can be 
shown, by substituting 15 for t, to be only .966. In our next 
section we shall show how to correct for this, at least in part. 
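
Both the coefficient and its ceiling are one-line computations. A minimal sketch follows, with function names chosen here for illustration:

```python
from math import sqrt


def contingency_c(chi_square, n):
    """Formula (210): mean-square contingency coefficient."""
    return sqrt(chi_square / (chi_square + n))


def max_c(t):
    """Largest C obtainable from a square table with t rows and t columns."""
    return sqrt((t - 1) / t)


print(round(contingency_c(36.9, 513), 3))        # Burgess and Cottrell figures
print(round(max_c(4), 3), round(max_c(15), 3))   # prints: 0.866 0.966
```

Direct evaluation with the figures quoted for the Burgess and Cottrell data gives about .259; the text carries the value forward as .25.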


CORRECTING COEFFICIENTS OF CORRELATION 
FOR BROAD CATEGORIES 


The coefficient of mean-square contingency is not the only 
coefficient of correlation that is left too low when computed 
from a table with broad intervals. The same thing is true of all 
correlations. It is true of every product moment r calculated 
from a correlation chart as compared with the r calculated from 
the paired scores. A correction is really called for in every r 
calculated from a correlation table where the scores are grouped 
in intervals wider than one unit, and it is imperatively called for 
whenever the number of categories is at all small—say below 
ten. We shall, therefore, attack in general terms the problem 
of correcting a coefficient of correlation for broad categories, then 
return to the application of the technique to the C of the above 
section.

We shall first treat the case in which our classes are taken as 
centered about the means of their respective intervals; then 
afterward we shall take up the case, familiar to us in customary
correlation work like that discussed in our Chap. IV, in which 
the items within an interval are regarded as centered about the 
mid-point of the interval. 

Let the unprimed letters stand for values centered around the
means of their intervals while the primed letters stand for the
variates themselves. Then $r_{xy}$ is the correlation in terms of
intervals, while $r_{x'y'}$ is the correlation when all the variates are
taken at their actual values. We want to find $r_{x'y'}$ in terms of
$r_{xy}$. We shall employ in simple form the technique of partial
correlation treated in a preceding chapter. On page 243 the
reader will find the formula

$$r_{01.2} = \frac{r_{01} - r_{02}r_{12}}{\sqrt{1 - r_{02}^2}\,\sqrt{1 - r_{12}^2}}$$

We shall let the 0 be x, the 1 be y, and the 2 be x'. Then

$$r_{xy.x'} = \frac{r_{xy} - r_{xx'}r_{yx'}}{\sqrt{1 - r_{xx'}^2}\,\sqrt{1 - r_{yx'}^2}}$$

The $r_{xy.x'}$ is the correlation between x and y with the x' variates
held constant. But, if the x' values are held constant, the x’s
would be constant, since the x’s are the means of the x' values by
intervals. Because any variable correlated with a constant
gives a zero correlation, $r_{xy.x'}$ equals zero. So we have

$$\frac{r_{xy} - r_{xx'}r_{yx'}}{\sqrt{1 - r_{xx'}^2}\,\sqrt{1 - r_{yx'}^2}} = 0$$

Multiplying through by $\sqrt{1 - r_{xx'}^2}\,\sqrt{1 - r_{yx'}^2}$, we get

$$r_{xy} - r_{xx'}r_{yx'} = 0$$

and transposing,

$$r_{xy} = r_{xx'}r_{yx'}$$

We should like to be rid of the $r_{yx'}$, so we shall try another
partial correlation.

$$r_{yx'.y'} = \frac{r_{yx'} - r_{yy'}r_{x'y'}}{\sqrt{1 - r_{yy'}^2}\,\sqrt{1 - r_{x'y'}^2}}$$

This partial r is the correlation between the y’s and the x' values
with y' held constant. But if the y' is constant, the y must be
constant for the same reason as given under the x’s. So our
partial correlation would again be equal to zero. Setting the
fraction on the right equal to zero, clearing of fractions, and
transposing as above, $r_{yx'} = r_{yy'}r_{x'y'}$.

We shall now substitute this value for $r_{yx'}$ in the equation
second above where we said we wished to get rid of it. After
making this substitution, we have

$$r_{xy} = r_{xx'}r_{yy'}r_{x'y'}$$

What we started out to find was the correlation between the
variates taken at their true values in terms of the correlation
when the values were taken as centered about the means of
intervals. That is, we want a value for $r_{x'y'}$. We can easily
solve for it the equation just given, getting

$$r_{x'y'} = \frac{r_{xy}}{r_{xx'}r_{yy'}} \qquad (211)$$

(Formula for correcting an r for broad categories when calculated
in terms of means of intervals)

The correlation between the variates is, therefore, the correlation 
in terms of means divided by the product of the r’s between means 
and variates in each of the two arrays. 

But the catch is that, except in cases where we can draw 
upon ready-made tables, we are not likely to know the coefficient 
of correlation between means and variates within each of the 
distributions. We must meet this difficulty by next getting a 
formula for the r’s between the means and the variates of a 
distribution. 

If we assume rectilinearity of regression between means and
variates, the value of a variate computed from the mean of its
interval is, as we learned when we studied the simple regression
equation in deviation form, $\bar{x}' = r_{xx'}\frac{\sigma_{x'}}{\sigma_x}x$. But the $\bar{x}'$ is the one
that lies on the regression line, and, assuming rectilinearity of
regression as said above, it is at the mean of its column. Therefore
$\bar{x}'$ is the same as x, which is by notation the value of the mean.
Dividing through therefore by this value, we get $r_{xx'} = \sigma_x/\sigma_{x'}$.
In a precisely similar manner it can be shown that

$$r_{yy'} = \frac{\sigma_y}{\sigma_{y'}} \qquad (212)$$

(Formula for the correlation between means and variates)

If now we substitute these values for the r’s in formula (211),
we get another formula for correcting for broad categories that
is simpler for many applications.

$$r_{x'y'} = \frac{r_{xy}}{(\sigma_x/\sigma_{x'})(\sigma_y/\sigma_{y'})} = r_{xy}\,\frac{\sigma_{x'}\sigma_{y'}}{\sigma_x\sigma_y} \qquad (213)$$

(Equivalent formula for r corrected for broad categories when
computed in intervals)

If we are working with distributions of unit area and unit
standard deviation (as we are when we deal with proportions
in the several intervals and apply x and z values from our tables,
pages 481 to 484) and if we assume normality of distribution as
it is assumed in those tables, both the $\sigma_{x'}$ and the $\sigma_{y'}$ will be 1, by
definition. Then our formula will simplify to

$$r_{x'y'} = \frac{r_{xy}}{\sigma_x\sigma_y} = \frac{\Sigma xy}{N\sigma_x^2\sigma_y^2} \qquad (213a)$$

(Formula for r corrected for broad categories in case of measures
centered about means of intervals and a normal distribution of unit
area and unit standard deviation)


This does not apply to the correction of r’s computed from
such correlation tables as we dealt with in Chap. IV and as we 
employ in most product-moment work, because there the items 
are taken to be centered about the mid-points of the intervals 
(called index values) rather than about the means of intervals. 
A reasonably satisfactory correction can be made for this case, 
but not so neatly as for the case of means. 

Let $x_i$ be the deviation of the mid-point score value of an
interval in the x array, and let x' be a score in deviation form.
Let $y_i$ and y' have similar meanings in the other array; and let
$c_x$ be the difference between an x' and the corresponding index
value and $c_y$ have a corresponding meaning in the other array.
Then

$$r_{x'y'} = \frac{\Sigma(x_i + c_x)(y_i + c_y)}{N\sigma_{(x_i + c_x)}\sigma_{(y_i + c_y)}} = \frac{\Sigma x_iy_i + \Sigma x_ic_y + \Sigma y_ic_x + \Sigma c_xc_y}{N\sigma_{(x_i + c_x)}\sigma_{(y_i + c_y)}}$$


Some $c_x$’s will be positive and some negative. If there were
nearly perfect correlation, not only would the $c_x$’s fall in the
same interval as their corresponding paired y values but would
also tend to fall on the same side of the mid-point of the interval
to which they belonged in the other array; thus there would
tend to be an excess of like-signed $x_ic_y$ values. But that would
happen only in extremely high correlations. Under all other
conditions they would fall largely at random on either side of
the mid-point and thus have random plus or minus signs; or
they might even fall into other intervals in the y array from
the corresponding one in the x array. Hence, over the whole
distribution the $y_ic_x$ products would tend to sum to zero. The
same thing would be true of the $x_ic_y$ products and the $c_xc_y$
products. Hence for an approximation which holds well for
all except extremely high r’s, the numerator needs no correction.
The σ’s in the denominator are standard deviations of distributions
in terms of index values, and their correction calls merely
for Sheppard’s correction, treated on pages 84 to 89. Therefore,
we have

$$r_{x'y'} = \frac{\Sigma x_iy_i}{N\sqrt{\sigma_{x_i}^2 - (1/12)}\,\sqrt{\sigma_{y_i}^2 - (1/12)}} = \frac{r_{x_iy_i}\,\sigma_{x_i}\sigma_{y_i}}{\sqrt{\sigma_{x_i}^2 - (1/12)}\,\sqrt{\sigma_{y_i}^2 - (1/12)}} \qquad (214)$$

(r corrected for broad categories in the case of index values)


Thus we can either incorporate the needed correction in
the original computations by making Sheppard’s correction in the
standard deviations or we can correct an r computed from broad
categories by dividing it by the product of the two ratios

$$\frac{\sqrt{\sigma_x^2 - (1/12)}}{\sigma_x} \quad\text{and}\quad \frac{\sqrt{\sigma_y^2 - (1/12)}}{\sigma_y}$$

where the σ’s are the ones computed in terms of index values.
It is these ratios which are tabled as the fourth and fifth columns
in Table XXXIII, page 398. This formula is extremely important,
since it fits the situation in which we customarily compute
coefficients of correlation. It is so easy to make Sheppard’s 
correction (merely subtract one-twelfth before taking the square 
root, or such modifications of this as are explained on page 73) 
that it would be a good practice to make it in all correlations 
computed from correlation tables. Certainly it is imperative 
to make it if the number of intervals is as few as 5 or 6 and better 
to do it with anything less than 12 or 14. 
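
As a minimal sketch of the correction for index values, assuming the standard deviations are expressed in interval units (so that Sheppard's correction is one-twelfth), the two ratios and the corrected r might be computed as follows. The function names and the sample figures in the usage lines are hypothetical.

```python
from math import sqrt


def sheppard_ratio(sigma_index):
    """sqrt(sigma^2 - 1/12) / sigma, with sigma computed from index values
    and expressed in interval units."""
    return sqrt(sigma_index ** 2 - 1.0 / 12.0) / sigma_index


def correct_r_index_values(r, sigma_x, sigma_y):
    """Formula (214): divide r by the product of the two ratios."""
    return r / (sheppard_ratio(sigma_x) * sheppard_ratio(sigma_y))


# Hypothetical illustration: r = .50 from a table whose index-value sigmas
# are 1.2 and 1.5 intervals.
print(round(sheppard_ratio(1.2), 3), round(sheppard_ratio(1.5), 3))
print(round(correct_r_index_values(0.50, 1.2, 1.5), 3))
```

The smaller the sigma in interval units (that is, the fewer the categories actually used), the further the ratio falls below 1 and the larger the correction becomes.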

One more case of interest is the special one where we may
assume a rectangular distribution. As shown on page 107, the
standard deviation of the means of a rectangle is $\sqrt{(k^2 - 1)/12}$,
where k is the length of the rectangle,¹ while the standard
deviation of the variables in a rectangular distribution (which is
the same as one in which the number of subdivisions is indefinitely
large) is $\sqrt{k^2/12}$. Using, then, for $r_{xx'}$ the standard deviation of
the means over the standard deviation of the variates, we have

$$r_{xx'} = \sqrt{\frac{k^2 - 1}{k^2}} \qquad (215)$$

(Correlation of variates with either means or index values in a
rectangular distribution)
¹ Kelley (Statistical Method, p. 267) gives as the standard deviation of a
rectangle [...]. But this is incorrect, and his table of values of r’s
on p. 268 (of Statistical Method) is also incorrect as far as the column for
rectangular distributions is concerned.



In correcting r for broad categories where rectangular distributions
are involved, we substitute this value for one or both of
the r’s between means and variates called for by formula (211).

If the width of intervals differs for different parts of the distribution,
we must actually compute the σ of the means or of
the index values. But, if the intervals are either known to be
equal or may be assumed to be equally spaced and if we assume
a certain known type of distribution, then the r’s may be tabled
once for all. Thereafter, knowing the number of categories
into which the distributions are divided, it is only necessary
to refer to such a table to obtain the r’s required for the denominator
of the fraction in formula (211). We shall give such a table
below.

TABLE XXXIII. COEFFICIENTS OF CORRELATION BETWEEN VARIATES
AND MEANS OR INDEX VALUES

Number of     Between means and     Between index values     Between means or index
categories    variates, normal      and variates, normal     values and variates,
              distribution          distribution             rectangular distribution
              r         r²          r         r²             r         r²
    2         .798      .637        .816      .667           .866      .750
    3         .859      .738        .859      .737           .943      .889

Because sometimes the number of categories is the same
in both distributions and sometimes different, we shall give both
r and r², the r² being needed in the former case and the product
of two different r’s in the latter. The r’s for means were computed
by formula (212), using a range of six sigmas along the
base line. Those for index values were computed by formula
(214), and those for rectangular distributions by formula (215).
The reader will note with interest, and perhaps surprise, the 
extremely close parallelism between the r’s for means and 
variates and those for index values and variates. 
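
The tabled r's between means and variates can be regenerated from formulas (212) and (215). The sketch below assumes k equal intervals laid over a base line of six sigmas, with the two end intervals taken to run out to infinity; that treatment of the ends is an assumption of this sketch, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def r_means_normal(k, span=6.0):
    """r between interval means and variates for a unit normal distribution
    cut into k equal intervals over `span` sigmas (formula 212).

    The mean of each sector is (left ordinate - right ordinate) / area,
    and the sigma of the variates is 1, so r is the sigma of the means.
    """
    unit = NormalDist()
    width = span / k
    cuts = [-span / 2 + i * width for i in range(1, k)]
    ordinates = [0.0] + [unit.pdf(c) for c in cuts] + [0.0]
    areas_below = [0.0] + [unit.cdf(c) for c in cuts] + [1.0]
    variance_of_means = 0.0
    for i in range(k):
        area = areas_below[i + 1] - areas_below[i]
        mean = (ordinates[i] - ordinates[i + 1]) / area
        variance_of_means += area * mean * mean
    return sqrt(variance_of_means)


def r_rectangular(k):
    """Formula (215): rectangular distribution with k categories."""
    return sqrt((k * k - 1) / (k * k))


for k in (2, 3, 4, 8):
    rn, rr = r_means_normal(k), r_rectangular(k)
    print(k, round(rn, 3), round(rn ** 2, 3), round(rr, 3), round(rr ** 2, 3))
```

For two and three categories this reproduces the .798 and .859 (normal) and the .866 and .943 (rectangular) of the table.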

We shall now return to the correction for broad categories 
of the C we computed for the Burgess and Cottrell data. An 
examination of the correlation table on page 415 reveals a very 
peculiar distribution. Certainly it is not a normal distribution, 
and neither is it rectangular. It is somewhat triangular in 
shape. Since we have no ready-made formula for correction in 
the case of triangular distributions, we would not be far amiss by 
viewing it as approximately the same shape as one-half of a 
normal distribution. If the regression between means and 
variates in a normal distribution is rectilinear, as we have been 
assuming, the slope of the regression line will be the same for the 
lower half as for the whole distribution. We shall therefore do 
well enough by using the correction for a normal distribution, 
provided we treat the number of categories as eight rather than 
four. Referring to the table for the r between means and variates 
in case of eight categories, we have 


$$\frac{.25}{.953} = .262$$


If we had treated the distribution as rectangular, with four 
divisions, our result would not be very different; we should have 
been required to divide the .25 by .938 and should have obtained 
.266. 
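
The two corrections of C just described are applications of formula (211) with the same r on both sides, so that the divisor is the tabled r². A small sketch, taking the r² values .953 and .938 quoted above as given (the function name is illustrative):

```python
from math import sqrt


def correct_for_broad_categories(r, r_x, r_y):
    """Formula (211): divide r by the product of the two r's between
    means and variates."""
    return r / (r_x * r_y)


C = 0.25
r_normal_8 = sqrt(0.953)   # eight categories, normal distribution
r_rect_4 = sqrt(0.938)     # four categories, rectangular distribution

c_normal = correct_for_broad_categories(C, r_normal_8, r_normal_8)
c_rect = correct_for_broad_categories(C, r_rect_4, r_rect_4)
print(c_normal, c_rect)    # about .2623 and .2665
```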


QUANTITATIVE VARIATES, UNEQUALLY SPACED INTERVALS 


The type of situation to which we next wish to apply a correla- 
tion technique is one in which we may plausibly think of the 
variates as quantitative in nature but in which we have insuffi- 
cient reason to believe the intervals into which the distributions 
are divided are of uniform length. In addition the number of 
categories is likely to be rather small, so that correction for 
broad categories is needed. As a concrete illustration of this 
type of problem we shall use part of an investigation by Fred F. 
Lininger on the relation of milk drinking to various types of 
physical and mental growth. The correlation table below shows 
on the X axis three categories for amount of milk consumed while 
on the Y axis are laid off amounts of gain in weight during the
school year. The study was conducted in the public schools of 
Philadelphia. 


TABLE XXXIV. EXTENT OF CONSUMPTION OF MILK IN RELATION TO
GAIN IN WEIGHT

Gain in         No milk    Milk at      Milk at home    Total    Percent-    Location
weight, lb.                home only    and school               age         of mean
Over 6             9          69            62           140       8.8       +1.815
4-6               64         242           227           533      33.1       +0.698
1-3              153         364           281           798      49.6       -0.474
0-0.99            29          48            25           102       6.3       -1.637
Loss              13          16             6            35       2.2       -2.386
Totals           268         739           601         1,608
Percentage      16.7        46.0          37.4
Location
of mean       -1.498      -0.280        +1.013


We cannot consider the distance from the central point of
“no milk” to ‘‘milk at home only” one step and that from 
“milk at home only” to “milk at home and school” another step 
of the same length. We do not know at all the relative lengths 
of these steps. In order to get some quantitative index for these 
distances, we must make some assumption about the nature of 
our distributions. It does not seem unreasonable to assume a 
normal distribution of children in respect to the amount of 
milk consumed by them. Since in this distribution we know the 
proportion of cases in each of the three categories, we can find 
the distance of the mean of each of the sectors from the mean 
of the distribution as a whole by a method discussed under our 
treatment of the normal curve, page 290. If the distance from
the mean of the sector to the mean of the whole distribution be
designated as $\bar{x}$, the height of the bounding ordinate at the left
of the sector by $z_1$, and the height of the bounding ordinate at the
right of the sector by $z_2$, then

$$\bar{x} = \frac{z_1 - z_2}{\text{area}}$$

In the sector on “milk at home and school” $z_1$ = 0.3789 as
given in the table, page 481, for a q of .374. Here $z_2$ is zero,
because this sector extends to the upper end of the distribution.
Hence

$$\bar{x} = \frac{0.3789 - 0}{.374} = 1.013$$

For the middle sector $z_1$ = 0.2502, as shown in the table for a
q of .167, and $z_2$ = 0.3789 as before. Hence the mean of this
middle sector lies at a distance from
the mean of the whole distribution

$$\bar{x} = \frac{0.2502 - 0.3789}{.460} = -0.280$$

[Fig. 31. Sectors labeled “Home only, 46%” and “Home and school.”]

In a similar manner the other x
value, and all the y values are found.

We now compute the coefficient of correlation in precisely
the same manner as a Pearson r for any other correlation table.
The procedure does not differ at all from that described in
Chap. IV. As the result we get

$$r = \frac{\Sigma xy}{N\sigma_x\sigma_y} = \frac{217.08}{1608(.890)(.923)} = .164$$


But this r stands in considerable need of correction for broad
categories, especially in respect to the x variable where there
are only three categories. To make this correction, we need to
divide the obtained r by the product of the r’s between means
and variates as indicated in formula (212), for we have been
working with our data centered about means of intervals rather
than about index values. We shall forego looking in our table
of these r values, page 398, because the intervals may not be
sufficiently equally spaced. According to formula (212):

$$r_{xx'} = \frac{\sigma_x}{\sigma_{x'}} \quad\text{and}\quad r_{yy'} = \frac{\sigma_y}{\sigma_{y'}}$$

We have already computed sz and øy; they are the standard 
deviations of the means obtained in connection with our solution 
above. Since we have assumed a normal distribution and are 
working with proportions with the aid of our integral tables, the 
standard deviation of the variates will be 1 in each of the dis- 
tributions—for our table upon which we drew for x and z values 
is based on the assumption of unit area and unit standard devia- 
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tion. Therefore 
Taz = F = o> = 890; and ty = F = oy = -923 


Our corrected r then becomes

$$r = \frac{.164}{(.890)(.923)} = .200$$

Had it not been for the fact that we wished to illustrate fully
the principles at stake, we could have saved several steps, and
yet secured precisely the same result, by applying directly
formula (213a) instead of proceeding by way of formulas (211)
and (212).

If we had assumed our intervals equal in span and had obtained
our values from Table XXXIII for the r's between means and
variates, we would have had

$$r = \frac{.164}{(.859)(.943)} = .202$$

But this closeness of approximation is somewhat accidental;
unless the intervals are actually spaced equally, we would not
always come as near the correct value by using the table.
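The arithmetic of the correction for broad categories can be retraced in a few lines. This is only an illustrative sketch using the figures quoted in the text; the function name `correct_for_broad_categories` is hypothetical and not from the original.

```python
def correct_for_broad_categories(r, r_means_variates_x, r_means_variates_y):
    """Divide the obtained r by the product of the two r's between
    means and variates, as the text describes for formula (212)."""
    return r / (r_means_variates_x * r_means_variates_y)


# Obtained r from the correlation table (figures quoted in the text).
r_obtained = 217.08 / (1608 * 0.890 * 0.923)

# Correction using the standard deviations of the sector means.
r_corrected = correct_for_broad_categories(0.164, 0.890, 0.923)

# Correction using the tabled values for equally spaced intervals.
r_from_table = correct_for_broad_categories(0.164, 0.859, 0.943)

print(round(r_obtained, 3), round(r_corrected, 3), round(r_from_table, 3))
```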


Exercises 


1. By dividing the students in Table IV into those “high” in general
intelligence (score 100 or above) and those “low” in this function (below
100), compute a biserial r between intelligence-test standings and
grade-point averages. Compare this with the Pearson product moment r.
Compute the P.E. of the product moment r and compare it with the P.E. of the
biserial r.

2. By dividing both distributions at some convenient point, compute
tetrachoric r for the same two factors as in Exercise 1, and compare with the
two r's computed in Exercise 1. Compute the P.E. of this r.

3. Compute this r from broad categories; i.e., group the intelligence scores 
into, say, four intervals and grade-point averages into five, determine r, 
and correct it for broad categories. 

4. Compute an r for the Burgess and Cottrell data on page 415 by the 
methods described on pages 391 to 393. 

5. Out of a total population of 475 college seniors, 71 were selected by a
guess-who test as most outstanding in social leadership and another 69 as
least effective in social leadership. Of the high ones 27 had taken more than
six credit hours of history and 44 had taken 6 hr. or less, whereas of the low
ones 14 had taken more than 6 hr. and 55 less. Compute the coefficient
of correlation between amount of history taken and effectiveness in social
leadership. Of these same students 26 of the high and 34 of the low had
taken more than 6 hr. of physical science while 45 of the high and 35 of the
low had taken less. Compute a similar r between extent of study of physical
science and leadership.
CHAPTER XIV
CHI SQUARE
THE NATURE OF $\chi^2$


In recent years a great deal of attention has been given the
$\chi^2$ test developed by Pearson.¹ The situations to which this test
may be applied are of the type where we have both theoretical 
and observed measures and wish to know whether differences 
between these measures can reasonably be regarded as chance 
variations. Provided the true variance within each class can be 
known or estimated, the $\chi^2$ technique can be applied to a number
of different types of problem to measure the probability of getting 
a given divergence in a sample from corresponding theoretical 
values in the parent population. One use is the testing of fit 
of a normal curve to the sample population; and it can be applied 
to all forms of curve fitting where we can know the distribution 
of the classes to the means of which we are attempting to fit 
the curve. Another important application is to test for associa- 
tion between variates in a contingency table. 

In general, if x is a value in the form of a deviation from the
mean of the whole population of its class and, in the infinite
population, the variates are normally distributed, then $\chi^2$ is
defined by the relation

$$\chi^2 = \sum \frac{x^2}{\tilde{\sigma}^2}$$

where $\tilde{\sigma}^2$ is the true population variance of the class. The x
may be a single measure, or it may be the mean of a sample,
or a proportion, or any other statistic normally distributed.
It is evident that $\chi^2$ is related to $s^2$, the variance of the set of n
statistics; for, if the deviations are taken from the population
mean of each class and are summed through the set, $\sum x^2 = ns^2$.


1 Pearson, Karl, “On the Criterion that a Given System of Deviations
. . . Can Be Reasonably Supposed to Have Arisen from Random Sampling,”
Phil. Mag. (London), Vol. 50, pp. 157-175 (1900).
Hence, if the $\tilde{\sigma}^2$ is the same for all the classes of which $\chi^2$ is
made up,

$$\chi^2 = \frac{\sum x^2}{\tilde{\sigma}^2} = \frac{ns^2}{\tilde{\sigma}^2}$$

In this chapter our interest is in the distribution of $\chi^2$ from
samples. $\tilde{\sigma}^2$ will always be the same for a given population.
But if the null hypothesis is assumed (that each class is a random
sample from the same homogeneous population), $s^2$ is an unbiased
estimate of $\tilde{\sigma}^2$. Therefore, the ratio $s^2/\tilde{\sigma}^2$ will sometimes be less
than 1 and sometimes greater; as the sample is increased in size, $s^2$
will approach $\tilde{\sigma}^2$, and $\chi^2$ will approach n. For certain purposes
which will appear later, we need to know the shape of the $\chi$
distribution and its area between certain ordinates. It may be
said at once that this distribution is not that of the normal curve,
except in one special case.

Suppose, now, we consider our x's one at a time. Then each $\chi^2$
will be merely $x^2/\tilde{\sigma}^2$. The probability of getting a $\chi^2$ of any
particular value will be the same as the probability of getting
an $x^2/\tilde{\sigma}^2$ of that same value. If we write $z^2$ to denote this value
of $x^2/\tilde{\sigma}^2$, the probability that a value $z_1$ will lie within an elemental
range $dz_1$ would be merely

$$df = \frac{1}{\sqrt{2\pi}}\, e^{-z_1^2/2}\, dz_1$$

The probability of getting a $\chi$ as great as, or greater than, a given
value would be the integral of the (normal) distribution function,
viz.,

$$\int_{z_1}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$

the value of which can be found in the normal probability tables.
Let us now deal with the probability of getting two given
independent values of x simultaneously. That probability
is the product of the two probabilities of getting them separately.
Hence the probability that these two variates will occur
conjointly within the same cell (elemental area) $dz_1\,dz_2$ becomes

$$df = \left(\frac{1}{\sqrt{2\pi}}\, e^{-z_1^2/2}\, dz_1\right)\left(\frac{1}{\sqrt{2\pi}}\, e^{-z_2^2/2}\, dz_2\right) = \frac{1}{2\pi}\, e^{-\frac{1}{2}(z_1^2 + z_2^2)}\, dz_1\, dz_2$$
and the probability of getting simultaneously, in an infinite
supply of samples, values as great as, or greater than, these
two would be

$$\text{(A)} \qquad P = \int_{z_1}^{\infty}\!\int_{z_2}^{\infty} \frac{1}{2\pi}\, e^{-\frac{1}{2}(z_1^2 + z_2^2)}\, dz_1\, dz_2$$

If we write $\chi^2 = z_1^2 + z_2^2$ and concern ourselves with the
problem of calculating this probability, we are confronted with the
fact that the value of $\chi^2$ may be the same although $z_1$ and $z_2$
may vary from sample to sample. We have here the equation of a
circle, center at the origin, with a radius equal to $\chi$, so that the
elemental area representing the joint probability may be in any
position on an elemental circular region of $\chi$ as a radius. In order
to state this fact more exactly and to evaluate the double integral,
it is convenient to resort to polar coordinates (Fig. 32).

[Fig. 32. Polar coordinates, showing the elemental area $\chi\,d\chi\,d\theta$.]

The elemental area $dz_1\,dz_2$ becomes¹ in polar coordinates
$\chi\,d\chi\,d\theta$. Substituting $\chi^2$ for $z_1^2 + z_2^2$ and $\chi\,d\chi\,d\theta$ for $dz_1\,dz_2$
above, we have

$$df = \frac{1}{2\pi}\, e^{-\chi^2/2}\, \chi\, d\chi\, d\theta$$


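If we denote by df the total probability of the occurrence of
$\chi$ within the circular region, we must integrate the above
expression with respect to $\theta$ from 0 to $2\pi$. Integrating with respect to
$\theta$, noting that $\frac{1}{2\pi}\,\chi e^{-\chi^2/2}\,d\chi$ remains constant,

$$df = \frac{1}{2\pi}\,\chi e^{-\chi^2/2}\,d\chi \int_0^{2\pi} d\theta$$

or,

$$\text{(B)} \qquad df = \frac{1}{2\pi}\,\chi e^{-\chi^2/2}\,d\chi\,(2\pi) = \chi e^{-\chi^2/2}\,d\chi$$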


1 See Kenney, J. F., Mathematics of Statistics, Part II, p. 37, for a simple
statement of the reason for this.
The probability of obtaining a value of $\chi$ as great as or greater
than a given $\chi_1$ would be expressed by

$$P = \int_{\chi_1}^{\infty} \chi e^{-\chi^2/2}\, d\chi$$

This last integral is of the form $e^v\,dv$ (see page 24) and may be
integrated directly. Performing this integration, we have

$$P = \left[-e^{-\chi^2/2}\right]_{\chi_1}^{\infty}$$

or,

$$\text{(C)} \qquad P = e^{-\chi_1^2/2}$$

For a given value of $\chi$ obtained from two independent values of x,
we could calculate P from this formula. Thus if $\chi_1 = 1$, we
would obtain, upon substituting in formula (C),

$$P = e^{-1/2} = \frac{1}{\sqrt{e}} = \frac{1}{\sqrt{2.718}} = .6065$$

This tells us that the chances are slightly more than 60 in 100
that we would obtain at random a value of $\chi$ as great as, or
greater than, 1 when the value of $\chi^2$ is made up of the sum of the
squares of two independent variates.

If we are concerned with the probability of the simultaneous
occurrence of three independent values of z, we would have a
triple integral corresponding to (A); or for n values, we would
have an n-fold integral.

Formula (B) expresses the $\chi$ distribution function in the case
of two independent variates, that is, the two independent quantities,
distributed normally about zero, that make up $\chi^2$. It will be
noticed that the exponent of $\chi$ in this same formula is 1, which is
one less than the number of independent variates. If we were to
go through a similar process for three independent variates $z_1$, $z_2$,
$z_3$ we would arrive at the formula

$$df = k\chi^2 e^{-\chi^2/2}\, d\chi$$

in which k is a constant, and the exponent of the $\chi$ factor is 2.
In this situation $\chi$ may be interpreted as the radius of a sphere,
and the elemental region as a spherical shell of thickness $d\chi$.
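Formula (C) for two independent variates can be verified by integrating the distribution function of formula (B) numerically. The sketch below is illustrative only; the function names, the midpoint rule, and the cutoff used in place of infinity are assumptions of this example and do not come from the original text.

```python
import math


def p_two_variates(chi):
    """Formula (C): P = exp(-chi^2 / 2) for two independent variates."""
    return math.exp(-chi * chi / 2.0)


def p_by_integration(chi, upper=12.0, steps=200_000):
    """Midpoint-rule integral of chi * exp(-chi^2 / 2) from chi to `upper`.

    `upper` stands in for infinity; the integrand is negligible beyond it.
    """
    width = (upper - chi) / steps
    total = 0.0
    for i in range(steps):
        c = chi + (i + 0.5) * width
        total += c * math.exp(-c * c / 2.0) * width
    return total


closed_form = p_two_variates(1.0)
numerical = p_by_integration(1.0)
print(round(closed_form, 4), round(numerical, 4))
```

Both values should round to the .6065 found in the text for a $\chi$ of 1.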
When the number of independent variates exceeds three, it
becomes impossible to visualize the meaning of the probability
of $\chi$ within an elemental region, and we say that we are dealing
with the geometry of hyperspace. The mathematics, although
more complicated, is carried out in a manner analogous to that
which we have already developed above, and we are able to
obtain the general formula for the distribution of $\chi$. When $\chi^2$
is defined by the equation

$$\chi^2 = z_1^2 + z_2^2 + z_3^2 + \cdots + z_n^2$$

the element for the distribution of $\chi$ becomes

$$\text{(D)} \qquad df = k\chi^{n-1} e^{-\chi^2/2}\, d\chi$$

in which k is a constant determined mathematically to be

$$k = \frac{1}{2^{(n-2)/2}\,[(n-2)/2]!}$$

Notice again that the exponent of $\chi$ is (n - 1), one less than
the number of independent variates that went to make up the
quantity $\chi^2$. Formula (D) is the $\chi$ distribution equation,
corresponding in concept to the normal probability equation. One
difference that should be pointed out is that the $\chi$ equation is
a function of n as well as of $\chi$, whereas the normal equation is a
function of z alone. This is because the normal probability
function is a special form of the $\chi$ function resulting when n = 1,
as may be seen by putting n = 1 in formula (D).
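A quick numerical check of formula (D) is sketched below. It is not from the original text: it assumes the standard normalizing constant of the $\chi$ distribution, $k = 1/(2^{(n-2)/2}\,\Gamma(n/2))$, with the gamma function standing in for the factorial of $(n-2)/2$, and the function names are hypothetical.

```python
import math


def chi_density(chi, n):
    """Ordinate of the chi distribution, k * chi^(n-1) * exp(-chi^2 / 2).

    Assumption of this sketch: k = 1 / (2^((n-2)/2) * Gamma(n/2)).
    """
    k = 1.0 / (2.0 ** ((n - 2) / 2.0) * math.gamma(n / 2.0))
    return k * chi ** (n - 1) * math.exp(-chi * chi / 2.0)


def total_area(n, upper=20.0, steps=100_000):
    """Midpoint-rule area under the chi density from 0 to `upper`."""
    width = upper / steps
    return sum(chi_density((i + 0.5) * width, n) * width for i in range(steps))


# The area under the curve should be 1 for any n.
areas = {n: total_area(n) for n in (1, 2, 3, 10)}

# For n = 1 the ordinate is twice the unit-normal ordinate (half-normal curve).
normal_ordinate = math.exp(-0.5) / math.sqrt(2.0 * math.pi)
ratio_at_one = chi_density(1.0, 1) / normal_ordinate
print(areas, ratio_at_one)
```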

Suppose, now, in a two-variable problem our operation were
so restricted that $(x_1 + x_2)$ would need to sum to a fixed amount.
Then when we had the probability of getting either $x_1$ or $x_2$,
that of getting the remaining one would be exactly the same.
There would be only one degree of freedom instead of two. If 
there were n values, but so limited that they had to sum to a fixed 
amount, then the probability of getting a certain value for the 
set of n would be exactly the same as the probability of getting 
the appropriate value for the (n — 1) independent ones. There 
would be, i.e., (n — 1) degrees of freedom from which the proba- 
bility would be determined. Thus, for every restriction that 
brings it about that a remaining term is determined when the 
others are known, the number of degrees of freedom is reduced
by 1. So the multiplicity of the integral is not n, the total
number of terms, but (n - a) where a is the number determined
when the others are given.

Thus in principle it would be simple enough to determine what
is the probability of getting a $\chi$ value as great as or greater than a
given value; we would need only to integrate the product normal
probability function as many times successively as we have 
degrees of freedom from the several indicated limits to infinity. 
But in practice this would be an impossibly arduous task and 
would be wholly impractical as a procedure. By resorting to 
generalized polar coordinates the evaluation becomes a perfectly 
straightforward task and may be carried out by integrating 
successively by parts, care being taken to consider separately the 
cases when n is even and when n is odd. Since the task is long 
and tedious we shall leave it to the ambitious student as an exer- 
cise and simply write down the resulting formulas. 

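When n is even, the expression for the probability of obtaining
by random sampling a value of $\chi$ equal to or exceeding a given
value is found to be

$$P = e^{-\chi^2/2}\left[1 + \frac{\chi^2}{2} + \frac{1}{2!}\left(\frac{\chi^2}{2}\right)^2 + \cdots + \frac{1}{[(n-2)/2]!}\left(\frac{\chi^2}{2}\right)^{(n-2)/2}\right] \qquad (216)$$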
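When n is odd,

$$P = \sqrt{\frac{2}{\pi}}\int_{\chi}^{\infty} e^{-\chi^2/2}\, d\chi + \sqrt{\frac{2}{\pi}}\; e^{-\chi^2/2}\left[\frac{\chi}{1} + \frac{\chi^3}{1\cdot 3} + \cdots + \frac{\chi^{n-2}}{1\cdot 3\cdot 5\cdots(n-2)}\right] \qquad (217)$$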

As soon as $\chi$ is known either Eq. (216) or (217) can be evaluated
after substitution and the value of P discovered. Even this is
too complicated to use in practice, so Elderton, working with
Pearson, tabled its values. Since the distribution is different
for different n's, the values are given not only for different $\chi^2$'s
but also for different n's. We give Elderton's table on pages 498
to 500. Later Fisher also tabled the $\chi^2$ values in a different
form; and still later Kelley and others also did so.
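With a computer the series no longer need to be read from a table. The following sketch evaluates P for even and for odd n; it is illustrative only. The even branch follows the finite series of Eq. (216); the odd branch uses the standard closed form for the chi-square tail (a two-tailed normal area plus a finite series in odd powers of $\chi$), which is assumed here to be what Eq. (217) expresses. The function name `chi_square_p` is hypothetical.

```python
import math


def chi_square_p(chi_sq, n):
    """Probability of a chi square as large as or larger than `chi_sq`
    with n degrees of freedom (n a positive integer)."""
    half = chi_sq / 2.0
    if n % 2 == 0:
        # Even n: exp(-chi^2/2) * [1 + h + h^2/2! + ... + h^m/m!], m = (n-2)/2
        term, total = 1.0, 1.0
        for j in range(1, (n - 2) // 2 + 1):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd n: two-tailed normal area beyond chi, plus the odd-power series
    chi = math.sqrt(chi_sq)
    two_tails = math.erfc(chi / math.sqrt(2.0))
    term, total = chi, 0.0
    for m in range(1, n - 1, 2):        # m = 1, 3, ..., n - 2
        total += term                   # adds chi^m / (1 * 3 * ... * m)
        term *= chi_sq / (m + 2)
    return two_tails + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


print(round(chi_square_p(1.0, 2), 4))    # two variates, chi = 1
print(round(chi_square_p(10.0, 14), 3))  # chi square of 10 with n = 14
print(round(chi_square_p(4.0, 1), 4))    # chi square of 4 with n = 1
```

For n = 2 the function reduces to formula (C), and for n = 1 it reduces to twice the tail area of the normal curve.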
THE COMPUTATION AND USE OF $\chi^2$


Now that we have given the $\chi$ distribution and have shown
how the formulas for P are obtained, we proceed to show how the
value of $\chi^2$ is found in practice and how the $\chi^2$ test may be
applied. We saw that the probability of obtaining together
values of $z_1$ and $z_2$ within the cell $dz_1\,dz_2$ is expressed by the
quantity $\frac{1}{2\pi}\, e^{-\frac{1}{2}(z_1^2 + z_2^2)}\, dz_1\, dz_2$ when the two variates are
assumed to be distributed normally about zero and are
uncorrelated. In the more general case where $z_1$ and $z_2$ may be
correlated, the product function would be, as shown on page 368, the
following:

$$df = \frac{1}{2\pi\sqrt{1 - r^2}}\; e^{-\frac{1}{2(1 - r^2)}\left(z_1^2 + z_2^2 - 2r z_1 z_2\right)}\, dz_1\, dz_2$$

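For the generalized case (n large and correlation present)
Pearson gives the formula

$$z = z_0\, e^{-\frac{1}{2}\left(\sum_{p} \frac{\Delta_{pp}}{\Delta}\frac{x_p^2}{\sigma_p^2} \;+\; 2\sum_{p,q} \frac{\Delta_{pq}}{\Delta}\frac{x_p x_q}{\sigma_p \sigma_q}\right)} \qquad (218)$$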


in which $z_0$ is a constant and the expression in parentheses involves
the summation of terms containing correlation determinants and
the n variates and standard deviations, the first term to be
summed for all values of p from 1 to n, and the second for all
pairs of values of p and q in which p is less than q. Pearson
defines the quantity within parentheses to be $\chi^2$ and proceeds to
show in his mathematical development that for the type of
application most frequently made of $\chi^2$, this complicated quantity
can be expressed in terms of weighted squared deviations between
theoretical and observed frequencies. The proof is so
complicated that we do not deem it advisable to include it here. The
formula for $\chi^2$ in terms of theoretical and observed frequencies is

$$\chi^2 = \sum \frac{(f_o - f_t)^2}{f_t} \qquad \text{(Chi square for goodness of fit of observed to theoretical frequencies)} \qquad (219)$$

in which $f_o$ and $f_t$ are the observed and theoretical frequencies,
respectively, in each group, and the summation extends over all
groups.


In practice we have only to compute $\chi^2$ from formula (219), to
make certain of the number of independent variates (degrees of
freedom) that contributed to its value, and to determine from
tables the probability of getting a value as large or larger on the
basis of random sampling. Let us now consider a dice-throwing
experiment in order to make more concrete the meaning of P by
using a very simple illustration.

The authors threw 12 dice in a group 14 times and recorded
the number of aces appearing in each throwing. Assuming the
dice to be balanced perfectly, we should expect theoretically two
aces to appear at each throwing; but, of course, this perfect
record was not obtained because of the influence of chance or
other factors. There were differences between observed
frequencies and those to be expected theoretically; and the question
arose as to whether these differences were so great as to lead us
to believe that the dice were biased. The actual number of
aces appearing among the 12 dice at each throwing ($f_o$), the
theoretical number expected ($f_t$), the deviations between observed
and theoretical frequencies $(f_o - f_t)$, these deviations squared
$(f_o - f_t)^2$, and the weighted squared deviations $(f_o - f_t)^2/f_t$ are
shown in Table XXXV.


TABLE XXXV. NUMBER OF ACES APPEARING AMONG 12 DICE IN 14 THROWINGS

No. of aces    Theoretical no.   Deviations between   Deviations      
appearing at   of aces expected  theoretical and      squared         
each throw     at each throw     observed                             
($f_o$)        ($f_t$)           $(f_o - f_t)$        $(f_o - f_t)^2$   $(f_o - f_t)^2/f_t$

     1              2                 -1                   1               1/2
     3              2                  1                   1               1/2
     2              2                  0                   0               0
     3              2                  1                   1               1/2
     1              2                 -1                   1               1/2
     4              2                  2                   4               2
     2              2                  0                   0               0
     4              2                  2                   4               2
     1              2                 -1                   1               1/2
     0              2                 -2                   4               2
     3              2                  1                   1               1/2
     2              2                  0                   0               0
     3              2                  1                   1               1/2
     1              2                 -1                   1               1/2

    30             28                  2                               $\chi^2 = 10$
The value of $\chi^2$ is seen to be 10, made up from the 14
independent variates or deviations. Entering the tables with a $\chi^2 = 10$
and n = 14 (14 is the number of independent variates or degrees
of freedom), we find that P = .762. This means that we should
expect to find a value of $\chi^2$ equal to or greater than 10 in more
than 76 out of 100 cases on the basis of chance, and this
probability is so large that we do not have reason to believe that the dice
were biased.
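The dice computation can be retraced directly from the observed counts. The sketch below is illustrative only: it applies formula (219) to the 14 throwings and then evaluates P from the finite series for even n rather than from Elderton's table. The variable names are hypothetical.

```python
import math

# Number of aces observed at each of the 14 throwings of 12 dice.
observed = [1, 3, 2, 3, 1, 4, 2, 4, 1, 0, 3, 2, 3, 1]
# Theoretical expectation: 12 dice * (1/6) = 2 aces at each throwing.
theoretical = [2] * len(observed)

# Formula (219): sum of squared deviations weighted by theoretical frequency.
chi_sq = sum((fo - ft) ** 2 / ft for fo, ft in zip(observed, theoretical))

# Each throwing is independent of the others, so n = 14 (an even number).
n = len(observed)
half = chi_sq / 2.0
p = math.exp(-half) * sum(
    half ** j / math.factorial(j) for j in range((n - 2) // 2 + 1)
)

print(chi_sq, round(p, 3))
```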

Degrees of Freedom.—At this point it might be well to make 
further mention of the meaning of degrees of freedom or number of 
independent variates. In the dice-throwing experiment the num- 
ber of aces appearing in any of the 14 groups was independent 
of the frequency in any of the other groups. Hence we must 
take n = 14 when entering the tables to find P. In the examples 
that are to follow, we shall see that the number of independent 
variates or degrees of freedom is not necessarily the same as the 
number of groups used in the computation of $\chi^2$. If, for
example, we had ten groupings of deviations between observed and
theoretical frequencies and the total number of frequencies was 
the same in each sample, there would be only nine independent 
variates or degrees of freedom since the tenth group contribution 
could be found by subtracting the total of nine groups from the 
grand total. 

How many degrees of freedom obtain in a given application
of $\chi^2$ depends upon what sort of universe of samples one has in
mind. If, in a contingency table, one is asking his question 
about the sampling fluctuation in that set of samples in which 
the marginal totals remain the same sample after sample, there 
are (k — 1)(r — 1) degrees of freedom, where k is the number of 
columns and r is the number of rows, because the necessity of 
constant totals for each row and for each column restricts the 
fluctuation in each column to (k — 1) cells and in each row to 
(r - 1) cells. It was upon this interpretation that Fisher
fastened in his epoch-making article.¹ It was only that
limitation which made exact mathematical treatment possible, since
Pearson’s original development hinged upon known theoretical 
values, which could only be afforded in a sampling scheme if the 
marginal totals remained the same for the whole supply of 


1 Fisher, R. A., “On the Interpretation of $\chi^2$ from Contingency Tables,”
J. Royal Statistical Society, Vol. 85, pp. 87-94 (1922).
samples and hence was the same as that of the population
sampled. But one may, and in most practical research would, 
wish to ask his question about the sampling fluctuation in all 
random samples of the same N, not only in that small portion 
of samples in which the marginal totals remain constant. In 
our illustration of Table XXXVII, for example, the normal 
expectation is that neither the relative numbers taking graduate 
work, college, etc., nor the numbers in the various categories of 
marriage adjustment would remain the same in successive sam- 
ples, but that these marginal totals would fluctuate from sample 
to sample as well as the frequencies in the several cells, the total 
population of the whole sample alone being fixed. Here it
would be only the $n'$th cell that could be filled in from a priori
knowledge, and the number of degrees of freedom would be
$(n' - 1)$ instead of $(k - 1)(r - 1)$. In a relatively recent
article Karl Pearson¹ has shown very clearly and convincingly
that the number of degrees of freedom for this interpretation is
$(n' - 1)$, even though the theoretical values are estimated from
the sample. He presents conclusive experimental evidence
that the $\chi'^2$'s thus obtained have very closely the same mean
and very nearly the same standard deviation as the $\chi^2$'s obtained
from known theoretical values and that the correlation between
$\chi'^2$ and $\chi^2$ in a number of trials is very high, from .93 to .99.
Thus the number of degrees of freedom must depend upon one’s 
meaning: if he is talking about the general case in which the 
samples may vary in every respect except N, the number of 
degrees of freedom is (n’ — 1), where N is the total population 
and n’ is the total number of cells; if he is talking about the 
special case in which the marginal totals are to remain constant 
through all the samples, the number of degrees of freedom is 
(k —1)(r — 1). In most statistical work this distinction has 
been ignored; workers have followed for all purposes Fisher’s 
lead in using (k — 1)(r — 1) indiscriminatingly. Since that is 
now the established custom, we shall follow it here in our illustra- 
tions in order to avoid confusion. But careful workers should 
make and apply the indicated distinction. In the article referred 
to, Pearson shows that the same principle applies in fitting an 
empirical distribution to the normal curve. If one fits a curve 


1 Pearson, Karl, “Experimental Discussion of the ($\chi^2$, P) Test for
Goodness of Fit,” Biometrika, Vol. 24, pp. 351-381 (1932).
we take as the probability of obtaining a score in that row 424.
The probability that a score will lie in the first column is taken 
as #8. The probability that a score will lie in the first row 
and the first column—the upper left-hand cell—will then be the 
product of these probabilities or ($2$)(48s). To obtain the 
number of frequencies to be expected in that cell, we must 
multiply the probability by the total number 513. We have 
($23) (S) (513) = (11.9). The other theoretical frequencies 
in parentheses are found in a similar manner. 
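The rule just described (row proportion times column proportion times the total number) can be sketched as follows. The 2 by 3 table is hypothetical illustration data, not Table XXXVII, and the function names are likewise assumptions of this example; the degrees of freedom follow the customary (k - 1)(r - 1) interpretation.

```python
def theoretical_frequencies(table):
    """Theoretical frequency for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_contingency(table):
    """Chi square and degrees of freedom with marginal totals held fixed."""
    expected = theoretical_frequencies(table)
    chi_sq = sum(
        (fo - ft) ** 2 / ft
        for obs_row, exp_row in zip(table, expected)
        for fo, ft in zip(obs_row, exp_row)
    )
    rows, cols = len(table), len(table[0])
    return chi_sq, (cols - 1) * (rows - 1)


# Hypothetical observed frequencies: 2 rows, 3 columns, N = 120.
hypothetical = [[10, 20, 30],
                [20, 20, 20]]

expected = theoretical_frequencies(hypothetical)
chi_sq, dof = chi_square_contingency(hypothetical)
print(expected, round(chi_sq, 3), dof)
```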

Let us now determine $\chi^2$ for the contingency table dealing with
the relationship of marriage-adjustment scores and husbands’ 
education. Here there are four columns (Very low, Low, High, 
and Very high) and four rows (Graduate work, College, High 
school, and Grades only). Since from the marginal totals we are 
able to compute the fourth row or column, knowing the three 
others, we have (4 — 1)(4 — 1) = (3)(3) = 9 degrees of freedom, 
if we are making the customary interpretation that the marginal 
totals remain fixed. 

The theoretical frequencies for each cell have already been 
calculated and are found in parentheses in Table XXXVII. 
We now display in Table XXXVIII the deviations between 
theoretical and actual frequencies and the weighted squared 
deviations for each cell. 


TABLE XXXVIII. DEVIATIONS BETWEEN THEORETICAL AND OBSERVED
FREQUENCIES $|f_o - f_t|$ AND WEIGHTED SQUARED DEVIATIONS
$(f_o - f_t)^2/f_t$ FOR THE CONTINGENCY TABLE XXXVII

                   Marriage-adjustment scores in relation to
                   husbands' education
Education          Very low      Low           High          Very high     Totals

Graduate work      7.9 (5.3)     8.8 (4.3)     8.2 (2.5)     8.5 (1.7)     (13.8)
College            3.1 (0.4)     3.6 (0.3)     2.9 (0.1)     9.6 (1.0)     ( 1.8)
High school        5.8 (1.9)    11.2 (4.8)     1.9 (0.1)    15.1 (4.3)     (11.1)
Grades only        5.2 (4.6)     1.4 (0.2)     3.4 (0.8)     3.2 (4.6)     (10.2)
Totals                (12.2)        ( 9.6)        ( 3.5)        (11.6)     (36.9)


The numbers given in parentheses are the squared deviations
divided by the theoretical frequencies for the cells. In the
upper left-hand cell, for example, we have

$$(f_o - f_t) = 4 - 11.9 = -7.9$$

the deviation, and $(f_o - f_t)^2/f_t = (-7.9)^2/11.9 = (5.3)$, the
weighted squared deviation. The other numbers in parentheses
are computed in the same way. The number (36.9) appearing
in the lower right-hand corner of the table is the sum of all the
squared deviations weighted for the theoretical frequencies and
is, therefore, our value of $\chi^2$. We enter the tables with $\chi^2 = 36.9$
and n = 9 and find that P = .000142. This value of P is
so small that we cannot attribute the differences to errors in
sampling, but must believe there is an association between
marriage adjustment and husbands' education.

Let us now apply the $\chi^2$ test to the normal-curve graduation
data for the 149 sophomore scores on the Carnegie Foundation
Tests, 1930, found in Table XXV. The differences between
the theoretical curve and the histogram shown in Fig. 21 reveal
that in some intervals the curve calls for greater frequency and
in other intervals less frequency than actually exists in our data.
The question naturally arises as to what extent the superimposed
curve truly represents the data in question. The $\chi^2$ test works
very well in testing goodness of fit; for, if the probability is so large
that we may obtain on the basis of chance a value of $\chi^2$ as large
as, or larger than, the one in hand, we may reasonably conclude
that the fit is a good one. On the other hand, if the probability
is small, we are unable to account for the difference on the basis
of chance fluctuation and must conclude that the curve is not
representative of our data.

The chi-square technique is not sound unless the numbers in
the cells are reasonably large.¹ For this reason it is customarily
advised that cells with small frequencies be combined. Since
the upper 3 and the lower 3 intervals of Table XXIV contain
ten or fewer theoretical frequencies, we have lumped these
extreme tails into 2 intervals, making 8 intervals for our data
instead of 12. The theoretical frequencies of the intervals are
determined in terms of the normal-curve function and the N
of this sample, as shown on page 418. The parameters in terms
of which this sample is fitted to the normal curve are N = 149,
mean = 215.4, and $\sigma$ = 50.9.

1 See Kenney, op. cit., p. 170, for a simple statement of the reason for this.
TABLE XXXIX. THE COMPUTATION OF $\chi^2$ FOR THE NORMAL-CURVE
GRADUATION OF TABLE XXIV

                     Frequencies
Interval            $f_o$     $f_t$     $(f_o - f_t)$     $(f_o - f_t)^2$     $(f_o - f_t)^2/f_t$

279.5-339.5          16        14           2                 4                .29
259.5-279.5          12        14          -2                 4                .29
239.5-259.5          19        19           0                 0                .00
219.5-239.5          26        23           3                 9                .39
199.5-219.5          22        24          -2                 4                .17
179.5-199.5          18        20          -2                 4                .20
159.5-179.5          14        16          -2                 4                .25
 99.5-159.5          22        19           3                 9                .47

Totals              149       149           0                            $\chi^2 = 2.06$


The $\chi^2$ is 2.06. We next wish to know the probability that
so large a $\chi^2$ could arise from this type of situation merely on
the basis of chance fluctuation. With what number of degrees
of freedom shall we enter the table? The same dual
interpretation is possible here as in the case of contingency tables, discussed
on page 412. If we mean how frequently would chance
fluctuation give rise to a $\chi^2$ as large as 2.06 in a sample of 149 scores when
only the size of the sample remains constant, the number of
degrees of freedom is $(n' - 1) = (8 - 1) = 7$. Entering Table
XLVIII with n = 7 and interpolating between $\chi^2 = 2$ and
$\chi^2 = 3$, we get P = .95. If we mean to ask about the P for that
universe of samples which continues to have, sample after sample,
the same mean and the same $\sigma$ as the initial one as well as the
same N, the degrees of freedom are $(n' - 3) = (8 - 3) = 5$.
For this the P is .84. Both of these indicate a very good fit.
The former means that, even if the function were distributed
perfectly normally in the whole population, as great departure
as we obtained or greater would occur in samples 95 times in 100.
The latter means that, even in that more restricted sampling
in which the mean and the standard deviation as well as the size
of the sample remain constant, so great a discrepancy would occur
by chance 84 times in 100. We may, therefore, feel no hesitancy
in believing that the distribution would be normal except for
chance fluctuation. In fact, the fit is unnaturally good; the
most likely P for a true fit is about .50.
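The goodness-of-fit computation of Table XXXIX, under both interpretations of the degrees of freedom, can be sketched as below. This is an illustration rather than the authors' procedure: P is computed from the standard closed form of the chi-square tail for odd n instead of by interpolation in the table, so the results differ slightly from the tabled readings. The function name `p_odd` is hypothetical.

```python
import math

# Observed and theoretical frequencies for the 8 lumped intervals.
observed = [16, 12, 19, 26, 22, 18, 14, 22]
theoretical = [14, 14, 19, 23, 24, 20, 16, 19]

chi_sq = sum((fo - ft) ** 2 / ft for fo, ft in zip(observed, theoretical))


def p_odd(chi_sq, n):
    """Chi-square tail probability for odd n (standard closed form)."""
    chi = math.sqrt(chi_sq)
    two_tails = math.erfc(chi / math.sqrt(2.0))  # two-tailed normal area
    term, series = chi, 0.0
    for m in range(1, n - 1, 2):                 # m = 1, 3, ..., n - 2
        series += term
        term *= chi_sq / (m + 2)
    return two_tails + math.sqrt(2.0 / math.pi) * math.exp(-chi_sq / 2.0) * series


cells = len(observed)
p_size_fixed = p_odd(chi_sq, cells - 1)      # only N fixed: 7 degrees of freedom
p_moments_fixed = p_odd(chi_sq, cells - 3)   # N, mean, sigma fixed: 5 degrees

print(round(chi_sq, 2), round(p_size_fixed, 2), round(p_moments_fixed, 2))
```

Summing unrounded cell contributions gives a $\chi^2$ slightly below the 2.06 obtained from the rounded entries, which is why the probabilities come out a little above the tabled readings.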
$\chi^2$ VALUES OUTSIDE THE RANGE OF OUR TABLE


Our table, following Elderton, extends from n = 2 to n = 29.
For n = 1, $\chi$ is distributed as half of a normal distribution, as
shown on the opening pages of this chapter. So, for n = 1, look
in our normal distribution table, pages 485 to 487, under $\chi = x/\sigma_x$
and obtain the percentage in the tail of the distribution, then
multiply this by 2. For example, $\chi^2 = 4$; $\chi = 2$; in the table
for $x/\sigma = 2$, q = (.50 - .4772) = .0228;

$$P = (2)(.0228) = .0456$$

This is the P corresponding to $\chi^2 = 4$.

For applications where n exceeds 29, Fisher has proposed that
we assume that $\sqrt{2\chi^2}$ may be treated as a normal deviate about
$\sqrt{2n - 1}$ as mean.¹ Example: $\chi^2 = 50$; n = 41; $\sqrt{82 - 1} = 9$;
$\sqrt{2\chi^2} = 10$; x = 10 - 9 = 1. Looking in our table for

$$\frac{x}{\sigma_x} = 1$$

we find that P = (.50 - .3413) = .1587.
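Both rules for values outside the table can be sketched with the normal curve alone. The example below is illustrative; it assumes Python's standard `statistics.NormalDist` in place of the printed normal table, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_one_degree(chi_sq):
    """n = 1: twice the normal tail area beyond chi = sqrt(chi_sq)."""
    chi = math.sqrt(chi_sq)
    return 2.0 * (1.0 - UNIT_NORMAL.cdf(chi))


def p_large_n(chi_sq, n):
    """Fisher's approximation for n above 29: treat sqrt(2 * chi_sq)
    as a normal deviate about sqrt(2n - 1) with unit standard deviation."""
    x = math.sqrt(2.0 * chi_sq) - math.sqrt(2.0 * n - 1.0)
    return 1.0 - UNIT_NORMAL.cdf(x)


print(round(p_one_degree(4.0), 4))     # the text's table lookup gives .0456
print(round(p_large_n(50.0, 41), 4))   # the text's table lookup gives .1587
```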


RELATIONS AMONG $\chi^2$, F, AND z
Recall, from our opening paragraphs, that

$$\text{(E)} \qquad \chi^2 = \frac{\sum x^2}{\tilde{\sigma}^2} = \frac{ns^2}{\tilde{\sigma}^2}$$

i.e., the denominator must contain the true population variance.
But sometimes this cannot be known and must be estimated.
Then the probability of a given value for the fraction as a whole
is a resultant of the probabilities of getting independently the two
sample values of numerator and denominator instead of that of
the numerator alone. The fact that now both numerator and
denominator are sample estimates changes the shape of the
distribution from that of a Pearson type III curve to a type VI
curve.² For the ratio of the two sample variances Fisher uses


1 By differentiating Eq. (D) with respect to $\chi$, equating the derivative to
zero, and solving, we find the modal value of $\sqrt{2\chi^2}$ to be $\sqrt{2n - 2}$.
Because the distribution is skew, the modal value would differ somewhat
from the mean value.
2 Fisher, R. A., “The Goodness of Fit of Regression Formulae, and the
$e^{2z}$ and Snedecor uses F, so that

$$\text{(F)} \qquad e^{2z} = F = \frac{s_1^2}{s_2^2}$$

The distribution of z or of F is found by using the product
probability principle. From Eq. (E), $s_1^2 = \chi_1^2\tilde{\sigma}^2/n_1$, and $s_2^2 = \chi_2^2\tilde{\sigma}^2/n_2$.
Then $F = (n_2\chi_1^2)/(n_1\chi_2^2)$, or $\chi_1^2 = (n_1/n_2)\chi_2^2 F$. Now using the
general $\chi^2$ distribution, we are able to write down the
simultaneous distribution of $\chi_1^2$ and $\chi_2^2$. After writing down this
complicated product function, we substitute for $\chi_1^2$ the value
$(n_1/n_2)\chi_2^2 F$, integrate for the whole range, and finally obtain the
distribution

$$df = 2\,\frac{[(n_1 + n_2 - 2)/2]!}{[(n_1 - 2)/2]!\;[(n_2 - 2)/2]!}\;\frac{n_1^{\,n_1/2}\, n_2^{\,n_2/2}\, e^{n_1 z}\, dz}{\left(n_1 e^{2z} + n_2\right)^{(n_1 + n_2)/2}}$$


The probability integral for the distribution of z or of F yields
very complicated expressions which must be evaluated for
different values of $n_1$ and $n_2$. Tables for z for the 5 per cent and
the 1 per cent points of the distribution were made by Fisher for
$n_1$ = 1, 2, 3, 4, 5, 6, 8, 12, 24, and $\infty$; and for $n_2$ values from 1
to 30, together with 60 and $\infty$. Snedecor made corresponding
tables for the distribution of F.
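The relation between Fisher's z and Snedecor's F is a simple change of scale, $e^{2z} = F$, and either can be computed from two variance estimates. The sketch below uses hypothetical variance estimates purely for illustration; the function name is an assumption of this example.

```python
import math


def f_and_z(s1_squared, s2_squared):
    """Return Snedecor's F (ratio of two variance estimates) and
    Fisher's z, where exp(2z) = F, i.e. z = (1/2) * ln(F)."""
    f_ratio = s1_squared / s2_squared
    z = 0.5 * math.log(f_ratio)
    return f_ratio, z


# Hypothetical variance estimates, not taken from the text.
f_ratio, z = f_and_z(12.0, 4.0)
print(f_ratio, round(z, 4))
```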


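STUDENT'S t
Student's t is defined by the relation

$$\text{(G)} \qquad t = \frac{x}{s_x}$$

where x is a deviation of a statistic from the true value in any
application in which the x's may be assumed to be normally
distributed and $s_x$ is an estimate of the standard deviation of
the x's made from the sample. But, from (E),

$$s_x = \frac{\chi\tilde{\sigma}}{\sqrt{n}}, \quad \text{whence} \quad t = \frac{x}{\chi\tilde{\sigma}/\sqrt{n}} = \frac{x\sqrt{n}}{\tilde{\sigma}}\cdot\frac{1}{\chi}$$

Since $\tilde{\sigma}$ and $\sqrt{n}$ are constants, the distribution of t is found by
first writing down the simultaneous distribution for x (normal)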
first writing down the simultaneous distribution for x (normal) 


Distribution of Regression Coefficients,” J. Roy. Statistical Soc., Vol. 85, 
p. 601 (1922). 
and for $\chi$ (as defined above), as is done in the case of the z 
distribution, substituting for x in terms of t, and performing the 
proper integration. The distribution turns out to be 

\[ \text{(H)} \qquad df = \frac{[(n - 1)/2]!}{\sqrt{\pi n}\,[(n - 2)/2]!} \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt \]


The definition of t in Eq. (G) is more general than that with 
which Student himself dealt. He dealt with only the case of the 
mean divided by an estimate of its standard error. But Eq. (G) 
assumes that the distribution is general for the whole class of 
statistics in which deviations from the true value make a normal 
distribution, and Fisher has proved that this is true. The
distribution of t is applicable “to all cases which can be reduced to 
a comparison of the deviation of a normal variate with an 
independently distributed estimate of its standard deviation, 
derived from the sums of squares of homogeneous normal
deviations, either from the true mean of the distribution or from the 
means of samples.”¹ We give tables for the probability integral 
of t on pages 173 and 488. 
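
The probability integral of t can also be approximated directly from the distribution. The sketch below assumes the standard form of the t density with n degrees of freedom, written with the gamma function; the function names are hypothetical and the integration is by Simpson's rule rather than by the method used to build the printed tables.

```python
import math

def t_density(t, n):
    """Density of Student's t for n degrees of freedom."""
    log_const = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                 - 0.5 * math.log(math.pi * n))
    return math.exp(log_const - ((n + 1) / 2) * math.log(1 + t * t / n))

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def two_tail_probability(t_value, n):
    """Probability that |t| exceeds t_value by chance."""
    inside = simpson(lambda t: t_density(t, n), -abs(t_value), abs(t_value))
    return 1.0 - inside

# For n = 10 the tabled 5 per cent point is about 2.228.
p = two_tail_probability(2.228, 10)
print(round(p, 3))
```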


ε² IN TERMS OF F 
In Chap. XI we made use of the distribution of $\epsilon^2$ when the true 
$\eta^2$ is zero and cited a table we had constructed for this purpose. 
We shall here derive the formula upon which that table was built. 
On page 333 we showed that in the sample 

\[ \sigma^2 = \sigma_a^2 + \sigma_m^2 \]

Substituting population estimates for sample values, 

\[ \frac{N - 1}{N}\,s^2 = \frac{N - k}{N}\,s_a^2 + \frac{k - 1}{kn}\,s_m^2 \]

where $s_m^2$ is the estimate of the population variance from the 
means of classes. Dividing through by $s_a^2(N - 1)/N$, and
remembering that kn = N, 

\[ \frac{s^2}{s_a^2} = \frac{N - k}{N - 1} + \frac{k - 1}{N - 1} \cdot \frac{s_m^2}{s_a^2} \]
1 Metron, Vol. 5, p. 94. 
But in the case where all variances are assumed to be estimates 
of the same homogeneous population variance (i.e., where the 
null hypothesis is being tested), $s_m^2/s_a^2 = F$. Therefore 

\[ \text{(J)} \qquad \frac{s^2}{s_a^2} = \frac{N - k}{N - 1} + \frac{k - 1}{N - 1}\,F \]

Now, by definition, $\epsilon^2 = 1 - s_a^2/s^2$. Hence, by algebraic
manipulation and then substitution in (J), 

\[ \frac{1}{1 - \epsilon^2} = \frac{N - k}{N - 1} + \frac{k - 1}{N - 1}\,F \]

Multiply through by $(1 - \epsilon^2)(N - 1)$, transpose, and solve for 
$\epsilon^2$, 

\[ \begin{aligned}
(N - 1) &= (N - k) - (N - k)\epsilon^2 + (k - 1)F - (k - 1)F\epsilon^2 \\
(N - k)\epsilon^2 + (k - 1)F\epsilon^2 &= (k - 1)F - (N - 1) + (N - k) \\
&= (k - 1)F - (k - 1) \\
\epsilon^2 &= \frac{(k - 1)F - (k - 1)}{(k - 1)F + (N - k)} \qquad (220)
\end{aligned} \]
We constructed our table of the distribution of $\epsilon^2$ by
substituting in Eq. (220) the values of F at the 1 per cent and the 5 per 
cent positions at the various N and k levels. This is Table 
XLVII, pages 494 to 497. 
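
The substitution is easily mechanized. The sketch below evaluates Eq. (220) and its inverse; the function names and the numerical values of F, N, and k in the usage lines are hypothetical illustrations, not entries from the table.

```python
def epsilon_squared(F, N, k):
    """Eq. (220): epsilon-square for a given F, N cases, and k classes."""
    return ((k - 1) * F - (k - 1)) / ((k - 1) * F + (N - k))

def f_from_epsilon_squared(eps2, N, k):
    """Inverse of Eq. (220), from 1/(1 - eps2) = (N-k)/(N-1) + (k-1)F/(N-1)."""
    return ((N - 1) / (1 - eps2) - (N - k)) / (k - 1)

# When F = 1 (the variances agree exactly) epsilon-square is zero.
print(epsilon_squared(1.0, 30, 3))
# A hypothetical significance point F = 3.35 with N = 30 and k = 3.
print(round(epsilon_squared(3.35, 30, 3), 4))
```

Values of F below 1 give negative values of $\epsilon^2$, which follows directly from the numerator of Eq. (220).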


A RECENT APPROACH TO SAMPLING DISTRIBUTIONS 


The whole matter of sampling distributions will probably be 
given reorientation in terms of some recent developments which 
bring all the issues we discussed in this chapter together into 
very simple perspective. Huntington¹ has recently shown how 
to write in general terms the distribution of the quotient between 
two independent statistics when we know the sampling
distribution of each of them. Suppose x is a variable distributed in 
accordance with a probability law $\int_{-\infty}^{+\infty} f_1(x)\,dx = 1$, and y is a 
variable distributed in accordance with the probability law 
$\int_{0}^{\infty} f_2(y)\,dy = 1$, x and y being independently distributed. Then 


1 Huntington, E. V., “Frequency Distributions of Product and Quotient,”
Ann. Mathematical Statistics, Vol. 10, pp. 195-198 (1939). 
the quotient w = x/y will be distributed according to the law 
$\int_{-\infty}^{+\infty} Q(w)\,dw = 1$, where 

\[ Q(w) = \int_{0}^{\infty} f_1(wy)\,f_2(y)\,y\,dy \]

This integral may be evaluated as soon as we are able to 
insert the values from the distributions of the separate statistics, 
x and y. There are three cases as follows: 

1. Where x has a gamma distribution and y is a constant. 
This is $\chi^2$, developed by Karl Pearson in 1900. 

2. Where x has a normal distribution and y has a gamma dis- 
tribution. This is t, developed by Student in 1908 but previously 
discovered by at least two other mathematicians. 

3. Where both x and y have gamma distributions. This is 
Fisher’s z or Snedecor’s F, developed by R. A. Fisher about 1924. 
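
The quotient integral can be tried out numerically on the simplest instance of the second case. The sketch below assumes the quotient law in the form $Q(w) = \int_0^\infty f_1(wy)\,f_2(y)\,y\,dy$ for a positive divisor y, and takes y to be the absolute value of a normal deviate (that is, $\chi$ for one degree of freedom); the quotient should then follow t for one degree of freedom, whose ordinate is $1/[\pi(1 + w^2)]$. All function names are illustrative assumptions.

```python
import math

def normal_density(x):
    """Unit normal curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def half_normal_density(y):
    """Distribution of |x| for a unit normal x; defined for y >= 0."""
    return 2 * normal_density(y)

def simpson(f, a, b, steps=6000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def quotient_density(w, f1, f2, upper=12.0):
    """Q(w) for w = x/y, with y positive; the tail beyond `upper` is negligible here."""
    return simpson(lambda y: f1(w * y) * f2(y) * y, 0.0, upper)

w = 0.7
q = quotient_density(w, normal_density, half_normal_density)
expected = 1 / (math.pi * (1 + w * w))
print(round(q, 6), round(expected, 6))
```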

We know the distributions of x and y for all these cases,
provided the samples are independent random ones drawn from 
variables normally distributed in the parent population.
Historically the sampling distributions of x and y were derived 
by use of hyperspace geometry, by the technique of generalized 
polar coordinates, and the presence of certain expressions like 
“degrees of freedom” comes over from this geometric approach. 
But Dunham Jackson¹ has recently shown how to derive these 
equations by purely analytic (algebraic) methods. This recent 
work by analytic methods makes possible simplification and 
unification of the concepts and processes involved in the topics 
of this chapter in a manner that is in marked contrast with the 
intricacies and the difficulties involved in their historical
development. But it is beyond the scope of this volume to follow through 
these derivations. 

Exercises 


1. Put together into a single contingency table the data given on page 83 
regarding conformity by taxi drivers and chauffeurs, and apply the $\chi^2$ 
technique to determine whether or not these differ significantly. Since $\chi^2$ 
cannot work with percentages, take each population to be 1,000, and reduce 
entries to numbers. 

2. For the exercises in this chapter where the number of degrees of freedom 
was taken as (k — 1)(r — 1) use instead (n’ — 1) and compare. Compare 


1 Jackson, Dunham, “Mathematical Principles in the Theory of Small 
Samples,” Amer. Mathematical Monthly, Vol. 42, pp. 344-364 (1935). 
for plausibility the two interpretations appropriate to the different degrees 
of freedom. 

3. In the Journal of Educational Psychology, Vol. 30, page 119, Wood and 
Davis give the following distributions of scores in acquisition and retention 
for a certain unit of work. The first row gives the score values and the last 
two rows give the frequencies for these scores in acquisition and in retention. 


Score                     7   9  11  13  15  17  19  21  23  25  27   Total

Frequency, acquisition    4   3   4  11   8  20  11  10   7  10   5      93
Frequency, retention      3   3  11  16  10  19  12   9   7   3   1      94


a. Wood and Davis apply the chi-square technique to this in order to 
determine whether there is any true difference between retention and 
acquisition, taking the obtained acquisition frequencies as the “expected” 
(theoretical) ones and the retention frequencies as the obtained ones. Perform 
this operation. What would need to be assumed about the constancy of the 
acquisition scores in successive samples when that method is used? How 
reasonable is that assumption? 

b. Work the problem to find $\chi^2$ by the regular method of a contingency 
table. What are now the assumptions? How reasonable are they? 

CHAPTER XV 
CURVE FITTING 
THE PROBLEM 


In many statistical applications one is required to find a 
smooth curve that is well adapted to indicate the general trend 
of the relationship between varying quantities. All with which 
he has to work is a number of plotted points obtained through 
measurement or testing, which points indicate the operation of 
some law of behavior with respect to these variables. The 
choice of the type of curve that would best represent any trend 
depends, of course, upon how we define the term best and upon the 
apparent adaptability of the curve to the data at hand. 

The extent to which the equation of a particular curve is 
descriptive of variation is limited by errors of sampling and 
insufficiency of data. Whenever experiment shows that some 
law of behavior is being obeyed, we must first assume, therefore, 
that it is expressible to a certain degree of approximation in a 
mathematical formula. We are then at liberty to set up a 
definition of best fit, and to proceed to derive the equation of the 
curve which best indicates the trend of our data. This task is 
known as curve fitting. 


TYPES OF CURVES 

There is a great number of curves which different distribu- 
tions seem to follow. Experience has shown, however, that the 
vast majority of data tend to follow a few types with remarkable 
frequency. The straight line, the parabola, the exponential 
curves of growth and decay, the Gompertz curve, the normal- 
probability curve, and the normal ogive are among those most 
frequently encountered. Their equations are given herewith: 


(A)  $y = mx + b$   (straight line) 
(B)  $y = ax^2 + bx + c$   (parabola) 
(C)  $y = be^{ax}$   (organic growth) 
(D)  $y = be^{-ax}$   (organic decay) 
(E)  $y = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-x^2/2\sigma^2}\,dx$   (normal ogive) 
(F)  $y = ki^{r^t}$   (Gompertz curve) 
(G)  $y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$   (normal-probability curve) 


Other equations of the second degree and higher, further 
trigonometric equations, parabolas of the nth degree, and many 
others are perhaps equally important. We shall, however, limit 
our treatment to those listed above and to a cursory account of 
the Pearson system of curves. The methods of curve fitting 
employed here are applicable to many other curves as well. 


METHODS OF CURVE FITTING 


There are several ways in which the fitting of curves may be 
accomplished. If the plotted points reveal a straight-line trend, 
we may simply place a thin, transparent ruler, or a cord, over 
them and adjust it in a way we consider best. This method, 
known as the graphical method, requires experience and good 
judgment. It should be used in situations where only moderate 
accuracy is required. 

For straight-line fitting, the method of averages is usually 
superior to the graphical method. The points are considered in 
two groups, and an average point is taken for each group. The 
problem is then simply that of finding the equation of the straight 
line which passes through these two points. This method should 
not be attempted unless there is an unquestionable straight- 
line trend, and the points are evenly scattered throughout. A 
disadvantage of the method of averages is that it does not lead 
to a unique equation, since the groupings are entirely arbitrary. 
It is, however, an easy method and may be used where rapid 
calculations are required. 
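
The method of averages can be sketched in a few lines. The grouping used below (the points are ordered by x and split into a lower and an upper half) is only one of the arbitrary groupings the method allows, and the function name and data are hypothetical.

```python
def fit_by_averages(points):
    """Straight line through the average points of two groups of (x, y) pairs."""
    ordered = sorted(points)
    half = len(ordered) // 2
    groups = (ordered[:half], ordered[half:])
    # Average point (mean x, mean y) of each group.
    (x1, y1), (x2, y2) = [
        (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        for g in groups
    ]
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1.
pairs = [(x, 2 * x + 1) for x in range(6)]
slope, intercept = fit_by_averages(pairs)
print(slope, intercept)
```

A different split of the same points would in general give a different line, which is the lack of uniqueness noted above.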

Probably the most important method of curve fitting is that 
of least squares. The principle of least squares states that the 
curve of a given type which best fits a given set of points is one 
in which the constants of the equation are so chosen as to make 
the sum of the squares of the errors a minimum. These errors 
are the amounts by which the actual ordinates of the points fail 
to agree with the ordinates of points on the curve. Thus, if y 
be the ordinate of one of the plotted points, and $\tilde{y}$ be the ordinate 
of a corresponding point on the theoretical curve, we are to make 
$\Sigma(y - \tilde{y})^2$ a minimum. The usual methods of the differential 
calculus are employed for this purpose and, as we shall see later, 
lead to a set of equations known as normal equations. By solving 
these normal equations, we find the values of the constants 
appearing in the type equation which we assumed to indicate 
the trend of the points. The method of least squares is general 
and may be applied to a variety of curves. We shall illustrate 
the least-squares method in the following examples. 


FITTING A STRAIGHT LINE 
Consider the following values of x and y: 


[Table of the 11 paired values of x and y; the only entries legible here are 27 and 56 | 63.]

When these values are plotted on graph paper (Fig. 33), it 
appears at once that there is a general straight-line relationship 
between the two varying quantities. Our problem is to
determine the equation of the line which best expresses this
relationship, i.e., the equation of the line of best fit. 

Fig. 33.

For convenience let us label the 11 paired items given above 
as $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, etc. 
Some of these points will fall 
above the line we seek, and some will fall below it. In other words, 
some of the errors will be positive, and some will be negative. 
But since we are to make the sum of the squares of the errors a 
minimum, we need not be concerned with the negative signs. 
Since the ordinate value of any point on the line is expressed by 
the straight-line equation $\tilde{y} = ax + b$, these errors (differences 
between the actual and the theoretical ordinate values) may be 
written as follows: 

\[ \begin{aligned}
y_1 - \tilde{y}_1 &= y_1 - (ax_1 + b) \\
y_2 - \tilde{y}_2 &= y_2 - (ax_2 + b) \\
y_3 - \tilde{y}_3 &= y_3 - (ax_3 + b)
\end{aligned} \]
etc. 

According to the least-squares principle we must minimize 
$\Sigma(y_i - \tilde{y}_i)^2$, where the summation is to extend from 1 to the 
number of plotted points. In this summation we may replace $\tilde{y}$ 
by its equal $(ax + b)$. Then the quantity to be made a minimum 
becomes $\Sigma[y - (ax + b)]^2$, in which the summation is to extend 
over the total number of points. 

We learn in the differential calculus that in order to make the 
above quantity a minimum or a maximum, we must have both 
the derivative with respect to a and that with respect to b equal 
to zero. We need, therefore, only to form these derivatives, 
equate them to zero, and solve the resulting equations for a and 
b.¹ Squaring the quantity within brackets, we obtain 

\[ \Sigma(y^2 + a^2x^2 + b^2 - 2axy - 2yb + 2abx) \]

Summing termwise and removing from under the summation 
symbol the a and the b which are independent of the summation, 
this expression becomes 

\[ \Sigma y^2 + a^2\Sigma x^2 + nb^2 - 2a\Sigma xy - 2b\Sigma y + 2ab\Sigma x \]

Performing the differentiation with respect to a and then with 
respect to b and at the same time equating the results to zero, 
we have² 

\[ \begin{aligned}
2a\Sigma x^2 - 2\Sigma xy + 2b\Sigma x &= 0 \\
2a\Sigma x - 2\Sigma y + 2nb &= 0
\end{aligned} \]

Transposing certain terms and dividing each equation through 
by 2, 

\[ \text{(H)} \qquad \begin{aligned}
a\Sigma x^2 + b\Sigma x &= \Sigma xy \\
a\Sigma x + bn &= \Sigma y
\end{aligned} \]


In Eq. (H) all the summations are known quantities. We 
have, therefore, two simultaneous equations in the unknowns a 
and b. Solving the simultaneous equations by the usual meth- 
ods, we find 


\[ a = \frac{n\Sigma xy - \Sigma x \cdot \Sigma y}{n\Sigma x^2 - (\Sigma x)^2} \qquad (221) \]

\[ b = \frac{\Sigma x^2 \cdot \Sigma y - \Sigma x \cdot \Sigma xy}{n\Sigma x^2 - (\Sigma x)^2} \qquad (221a) \]

1 It may be shown by further mathematical treatment that in the present 
instance the condition for a minimum, and not a maximum, is satisfied. 

2 Observe that when differentiation is performed with respect to a 
particular letter, the others are treated as constants. 

Equations (221) and (221a) are general formulas which may 
always be used in finding the constants a and b. It is obvious 
that the method is perfectly general and holds for any number of 
plotted points. All that the worker need do is to compute the 
indicated sums and substitute into these formulas. 

The reader will note the similarity between the form of the 
expression for a and that of the coefficient of correlation r, 
(page 99). It can be seen that if the variabilities in the case 
of an r are made equal (i.e., if $\sigma_x = \sigma_y$), the formula for r reduces 
to the formula for a. This tells us that a coefficient of correlation 
is the slope of the line of best fit when the variabilities are made 
equal. 
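
The connection may be written out in a line or two. This is a sketch that assumes the usual sample definitions $\sigma_x^2 = \Sigma x^2/n - \bar{x}^2$ and $r = (\Sigma xy/n - \bar{x}\bar{y})/(\sigma_x\sigma_y)$, which are the standard ones and are not quoted from page 99. Dividing the numerator and the denominator of Eq. (221) by $n^2$,

\[ a = \frac{\Sigma xy/n - \bar{x}\bar{y}}{\Sigma x^2/n - \bar{x}^2} = \frac{\Sigma xy/n - \bar{x}\bar{y}}{\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x} \]

so that a = r whenever $\sigma_x = \sigma_y$.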

For the example with which we began this development, the 
sums which extend over the 11 items are as follows: 


$\Sigma xy = 7{,}374$; $\Sigma x = 165$; $\Sigma y = 366$; $\Sigma x^2 = 3{,}465$; (n = 11) 


Substituting into Eqs. (221) and (221a) and performing the 
indicated arithmetical operations, we find a = 1.90; b = 4.73. 

The equation of the line of best fit, obtained by the method 
of least squares, is, therefore, 


\[ y = 1.90x + 4.73 \]
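
The arithmetic of Eqs. (221) and (221a) can be checked with a short routine. The sketch below takes the five sums quoted for the 11 items and returns the two constants; the function name is a hypothetical one chosen for illustration.

```python
def line_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    """Slope a and intercept b of the least-squares line, Eqs. (221) and (221a)."""
    denom = n * sum_xx - sum_x ** 2
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return a, b

a, b = line_from_sums(n=11, sum_x=165, sum_y=366, sum_xx=3465, sum_xy=7374)
print(round(a, 2), round(b, 2))

# The constants should satisfy both normal equations (H).
check_1 = a * 3465 + b * 165   # should equal the sum of xy
check_2 = a * 165 + b * 11     # should equal the sum of y
```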


THE PARABOLA 


If the plotted points indicate a parabolic trend, we may find 
by the least-squares method the equation of the parabola of 
best fit. Consider, for example, the following 12 values of 
x and y. 


These points when plotted (see Fig. 34) reveal an unmistaka- 
bly parabolic trend. Let us assume, therefore, that the desired 
curve is of the form $y = ax^2 + bx + c$. The constants a, b, 
and c are to be determined from the data by the method of 
least squares. 
As in the case of the straight line, we shall develop the formulas 
for the general case and then arrive at particular values by 
substitution. 

Fig. 34.

Our problem is to minimize $\Sigma(y_i - \tilde{y}_i)^2$, where $y_i$ stands for 
the ordinate of any of the actual points, and $\tilde{y}_i$ the theoretical 
ordinate. But since we are assuming that the theoretical value 
of $\tilde{y}$ is given by the expression $\tilde{y} = ax^2 + bx + c$, we may say 
that we must minimize $\Sigma[y - (ax^2 + bx + c)]^2$. 

\[ \begin{aligned}
\Sigma(y - ax^2 - bx - c)^2 = {} & \Sigma y^2 + \Sigma a^2x^4 + \Sigma b^2x^2 + \Sigma c^2 \\
& - 2\Sigma yax^2 - 2\Sigma ybx - 2\Sigma yc \\
& + 2\Sigma acx^2 + 2\Sigma bcx + 2\Sigma abx^3
\end{aligned} \]

Differentiating in turn with respect to a, b, and c and setting the 
derivatives equal to zero, we obtain 

\[ \text{(I)} \qquad \begin{aligned}
2\Sigma ax^4 - 2\Sigma yx^2 + 2\Sigma bx^3 + 2\Sigma cx^2 &= 0 \\
2\Sigma bx^2 - 2\Sigma yx + 2\Sigma ax^3 + 2\Sigma cx &= 0 \\
2\Sigma c - 2\Sigma y + 2\Sigma ax^2 + 2\Sigma bx &= 0
\end{aligned} \]

We may take the constants a, b, and c outside the summation 
symbols. Furthermore, $\Sigma c = nc$. Making these changes,
dividing through each equation by 2, and at the same time rearranging 
the terms by transposition, Eqs. (I) may be written 

\[ \text{(J)} \qquad \begin{aligned}
a\Sigma x^4 + b\Sigma x^3 + c\Sigma x^2 &= \Sigma x^2y \\
a\Sigma x^3 + b\Sigma x^2 + c\Sigma x &= \Sigma xy \\
a\Sigma x^2 + b\Sigma x + nc &= \Sigma y
\end{aligned} \]


Equations (J) are the normal equations for the parabola 
of best fit. We need only to compute the summations for the 
data of our problem and then solve the resulting equations for 
a, b, and c. The sums for the data given above are as follows: 

$\Sigma x^4 = 39{,}974$   $\Sigma x^3 = 4{,}356$   $\Sigma x^2 = 506$   $\Sigma x = 66$ 

$\Sigma y = 198$   $\Sigma xy = 1{,}328$   $\Sigma x^2y = 12{,}220$   (n = 12) 

Our normal equations are, therefore, 

\[ \begin{aligned}
39{,}974a + 4{,}356b + 506c &= 12{,}220 \\
4{,}356a + 506b + 66c &= 1{,}328 \\
506a + 66b + 12c &= 198
\end{aligned} \]

Upon solving these equations by the ordinary methods of 
elimination, we find a = 0.94, b = −8.66, c = 24.44. Hence, 
the equation of the parabola of best fit is 

\[ y = 0.94x^2 - 8.66x + 24.44 \]
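
The elimination can be repeated mechanically as a check. The sketch below solves the three normal equations by Gauss-Jordan elimination in exact fractions; the routine name is a hypothetical one. Carried to more figures, the solution comes out at about a = 0.931, b = −8.565, c = 24.368, slightly different from the constants quoted above, a difference of the size that rounding during hand elimination would produce.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Coefficients and right-hand members of the normal equations for the 12 items.
normal_matrix = [[39974, 4356, 506],
                 [4356, 506, 66],
                 [506, 66, 12]]
normal_rhs = [12220, 1328, 198]

a, b, c = (float(v) for v in solve_linear_system(normal_matrix, normal_rhs))
print(round(a, 3), round(b, 3), round(c, 3))
```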


We have seen that in the cases of the straight line and the 
parabola the task of obtaining the equation of the curve of 
best fit by the method of least squares is a direct one of minimizing 
errors—differences between actual and theoretical scores. The 
process leads to sets of normal equations which involve certain 
summations as the coefficients of the unknown parameters of 
the selected theoretical curve. Since the method is general and 
the results are unique, Eqs. (221) and (221a) may be regarded 
as formulas for straight-line fitting, and Eqs. (J) may be taken 
as the set of equations which yield the parabolic coefficients. 


CURVES OF GROWTH AND DECAY 


The problem of fitting curves of growth to a given set of data, 
is, in general, more complicated than that of fitting straight lines 
or parabolas. Especially complicated is the situation in which 
there is a tendency toward saturation or maturity in the later 
stages of the development of the 
growth factors. 

In the case of organic growth 
under ideal conditions, where 
the increments of growth are 
continuously accumulating, we 
may simplify our task by first 
transforming the assumed
equation into another form and then 
working with this latter form. 
We simply take the logarithm of each member of the assumed 
equation. This puts the growth curve in the form of a straight 
line. Consider, for example, the following seven values of 
x and y. 

Fig. 35.


These data reveal a trend of growth of a certain nature (Fig. 
35). There appears no tendency toward saturation or maturity 
as we proceed with increasing values of the time factor x. We 
may assume, therefore, that the variation in growth can well 
be approximated by Eq. (C), $y = be^{ax}$. The mathematical 
justification for this conclusion rests, as we shall see later, upon 
the straight-line trend of the log values obtained by taking 
the logarithm of the y values of the above list. We shall rewrite 
our list of paired values and include these log y values. 

x        ...     3      4      5      6 
y        ...    40     90    150    300 
log y    ...  1.60   1.95   2.18   2.48 


The log y values are plotted against the x values in Fig. 36. 
Here we observe a straight-line trend of log values. Now let 
us study the nature of our assumed type equation. We have 
$y = be^{ax}$. Take the logarithm (base 10) of each member; 
then $\log y = \log be^{ax}$. Since the log of a product equals the sum 
of the logs of the individual factors, we may write 

\[ \log y = \log e^{ax} + \log b \]

By the exponential law of logarithms, the first term on the right 
may be written $ax \log e$. Hence, our equation becomes 

\[ \log y = ax \log e + \log b \]

Now log e (base 10) equals 0.4343. Therefore, 

\[ \log y = 0.4343ax + \log b \]

This equation is a straight-line form if we let $Y = \log y$, 
$A = 0.4343a$, $B = \log b$. That is, our equation is of the form $Y = Ax + B$. 
Moreover, our problem has been reduced to that of finding 
the straight line of best fit for our data when expressed in terms 
of log units which vary with the original x units. In addition, 
we have justified the selection of the type (C) curve to fit the 

Fig. 36.
data. 

To complete the problem, we have only to make use of the 
formulas (221) and (221a)—using Y (log y) in place of the y 
appearing therein. The summations required are as follows: 

$\Sigma xY = 42.14$, $\Sigma x^2 = 91$, $\Sigma Y = 11.69$, $\Sigma x = 21$, $n = 7$ 

Substituting these values into Eqs. (221) and (221a), we find 

\[ A = \frac{7(42.14) - (21)(11.69)}{7(91) - (21)^2} = 0.25 \]

\[ B = \frac{(91)(11.69) - (21)(42.14)}{7(91) - (21)^2} = 0.91 \]

One form of the equation of best fit is, therefore, 

\[ Y = 0.25x + 0.91 \]

or, $\log y = 0.25x + 0.91$. This latter form may be expressed 
in exponential form in either one of two ways. Since log y 
is taken to the base 10, we may write as a consequence of the 
definition of a logarithm, $y = 10^{0.25x + 0.91}$. Rearrange the
right-hand member by the law of exponents for multiplication. 

\[ y = 10^{0.25x} \cdot 10^{0.91} \]

Now it is easily verified that $10^{0.91} = 8.1$. Hence, 

\[ \text{(K)} \qquad y = 8.1 \cdot 10^{0.25x} \]

Equation (K) is satisfactory as a final working form of the 
equation which best fits the data of our last list. The reader 
will observe that in place of e raised to the variable power, we 
have during the process changed to the number 10. This, of 
course, is not necessary; it may readily be put again in terms of 


e, as follows: 
Let $10^{0.25x} = e^{cx}$, in which we wish to determine the value of c. 
Take logarithms (base 10) of both sides. 

\[ \log 10^{0.25x} = \log e^{cx} \]

Since the log of a quantity to an exponent equals the exponent 
times the log of the quantity, 

\[ 0.25x \log 10 = cx \log e \]

Now make use of the facts that log 10 (base 10) = 1, and that 
log e (base 10) = 0.4343. Then $0.25x = 0.4343cx$. 
Divide through by 0.4343x; then c = 0.57. Hence 

\[ 10^{0.25x} = e^{0.57x} \]

Substitute into (K), and we obtain 

\[ \text{(L)} \qquad y = 8.1e^{0.57x} \]


Equations (K) and (L) are entirely equivalent. Either of 
them may be regarded as that equation which is expressive 
of the growth variation indicated by the data with which we are 
working. A curve of this type is known as a curve of organic 
growth. It is frequently encountered whenever the conditions 
are nearly ideal and whenever there is no tendency toward 
saturation or maturity during the time measurements are being 
made. 
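
The whole procedure for the growth curve (fit a straight line to the logs, then return to exponential form) may be sketched as follows. The sums are those quoted above for the seven items; the variable names are illustrative, and the constants are rounded to two places before conversion only because the text does so.

```python
import math

# Sums for the seven paired values, with Y = log10(y).
n, sum_x, sum_Y, sum_xx, sum_xY = 7, 21, 11.69, 91, 42.14

# Straight line Y = A*x + B by Eqs. (221) and (221a).
denom = n * sum_xx - sum_x ** 2
A = round((n * sum_xY - sum_x * sum_Y) / denom, 2)
B = round((sum_xx * sum_Y - sum_x * sum_xY) / denom, 2)

# Base-10 form y = b * 10**(A*x), as in Eq. (K).
b_coeff = 10 ** B

# Base-e form y = b * e**(c*x), as in Eq. (L); c = A / log10(e).
c_rate = A / math.log10(math.e)

print(A, B, round(b_coeff, 2), round(c_rate, 4))
```

Carried to four places, c is about 0.5756; the value 0.57 used in Eq. (L) is this figure cut off at two places.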

Equation (D), in which the exponent is negative, is known as 
the curve of organic decay. It may be regarded as a growth 
curve in which the increments are continuously falling off as 
the variable x increases. A graph of a theoretical curve is given 
in Fig. 37. The method of fitting the curve of organic decay 
is the same as that of fitting the 
curve of organic growth. In 
either case the worker should 
be reasonably certain that his 
assumed curve is representative 
of the data at hand. The test 
of selection lies in taking
logarithms of the y scores (growth 
measurements). If these log values, when plotted against the 
x scores (units of time), indicate a straight-line trend, the assumed 
curve may be considered predictive of the growth of the factor 
under consideration. 

Fig. 37.

The problem of growth curves in general has received
considerable attention in recent years. Many empirical formulas have 
been developed, and many theories concerning the nature of 
growth have been advanced. Pearl¹ has shown that the curve 
of population growth (an S-shaped logistic curve) is applicable 
to many forms of growth. He has also given evidence that the 

1 Pearl, Raymond, The Biology of Population Growth, Alfred A. Knopf, 
Inc., 1925; also Studies in Human Biology, Williams & Wilkins Company, 
1924, Part IV. 
same generalized curve deals excellently with unsymmetrical 
as well as with symmetrical growth.¹ Bass-Becking defines 
growth in terms of cell growth. In certain forms of
development he finds growth best expressed as the differential quotient 
of cell volume increase and time interval. He has pointed out 
the interesting fact that whenever small cells grow at the same 
rate as large cells, a normal distribution of cells reappears after a 
finite number of growth periods and cell-division periods. This 
indicates the applicability of the normal ogive curve (the curve 
obtained by summing the ordinates of a normal curve) in certain 
situations. Peters believes that the curve of growth in ideational 
learning may be the ogive.² He gives both theoretical and 
empirical reasons for his hypothesis. L. L. Thurstone has found 
that the hyperbola seems best to fit the norms of 40 tests, and 
claims this type to be the curve of learning.³ As Peters has 
pointed out, the hyperbolic trend is due to the fact that the 
material used by Thurstone involves only the upper levels of 
growth and consequently does not take into account the possible 
influence of the early learning increments upon the theoretical 
shape. 

THE NORMAL OGIVE CURVE 


Type (E), which is shown at the beginning of this chapter, is 
the mathematical expression of the theoretical normal ogive. 
As the equation indicates, it is the curve resulting from integrat- 
ing the normal-curve function. Empirically, this amounts to 
summating the ordinate values (z scores) of the normal distribu- 
tion of data we have at hand. Figure 38 shows the normal ogive 
in relation to average scores from the Peters General Information 
Test. 


THE GOMPERTZ CURVE 


Perhaps one of the curves most applicable in biological and 
psychological research is the well-known Gompertz curve.⁴ 


1 Pearl, R., and L. J. Reed, “Skew Growth Curves,” Proc. Nat. Acad. 
Sci., Vol. 11, pp. 16-22 (1925). 

2 Peters, C. C., Foundations of Educational Sociology, rev. ed., 1930, The 
Macmillan Company, pp. 452-456. 

3 Thurstone, L. L., “The Learning Curve Equation,” Psychological 
Monographs, Vol. 26, No. 3 (1919, No. 114). 

4 Gompertz, B., “On the Nature of the Function Expressive of Human 
S. A. Courtis has begun with this formula as a basis, type (F), 
and has attempted to show a certain universality underlying 
all biological growth.¹ A feature of the work of Courtis consists 
in transforming the formula developed by Gompertz into a 
straight-line form by twice taking logarithms of each member of 
the original equation. He has shown that under standard con- 
ditions log log values of the percentages of development are 
directly proportional to the times in which changes in develop- 
ment occur. This means that the relationship between time and 
log log values may be represented by a straight line. In other 
words, the percentage of development increases to equal powers 
of itself in equal periods of time. 


Fig. 38. 


In order that the reader may become acquainted with the 
growth measurement methods of Courtis, we shall give a brief 
discussion of the mathematical theory underlying his work and 
follow this with an illustration.² 

The equation of the Gompertz curve is 


\[ \text{(M)} \qquad y = ki^{r^t} \]


in which y represents measure of growth at the time t, k the 
value at maturity, i the initial development, and r the rate of 
growth. If we take k to be 1 and write our equation 

\[ \text{(N)} \qquad y = i^{r^t} \]


Mortality, and on a New Mode of Determining the Value of Life Contin- 
gencies,” Trans. Roy. Soc. (London), Vol. 115, pp. 513-585, 1825. 

1 Courtis, S. A., The Measurement of Growth, Brumfield and Brumfield, 
1932. 

2 For a full treatment of the mathematical properties of the Gompertz 
curve, see S. A. Courtis, op. cit. 
our growth is expressed in terms of percentage of development. 
Equation (N) is known as the simplex growth curve. Its fitting 
to growth data depends upon the following development: 

Take the logarithm of each member of (N). 

\[ \text{(O)} \qquad \log y = r^t \log i \]

Take the logarithm of each member of (O). 

\[ \text{(P)} \qquad \log \log y = t \log r + \log \log i \]

Equation (P) is the equation of a straight line, 

\[ \text{(Q)} \qquad Y = At + B \]

in which 

\[ \text{(R)} \qquad Y = \log \log y; \quad A = \log r; \quad B = \log \log i \]


It is evident from the foregoing discussion that if the log log 
values of a given set of percentages of development indicate a 
straight-line trend, our problem is simply that of finding the 
constants A and B, and then returning to the original Gompertz 
curve (M). A is the slope of the line of the log log values, and B 
is the initial log log value, i.e., the value when t = 0. 
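
The return from A and B to the constants of the simplex curve can be sketched as follows. The data are synthetic, generated from assumed values i = 0.3 and r = 0.4 and not taken from any of the studies cited, and the function names are hypothetical. Because log y is negative for a proportion between 0 and 1, the second logarithm is taken of its absolute value, so that Y = log(−log y) and, correspondingly, B = log(−log i).

```python
import math

def simplex_growth(t, i, r):
    """Simplex growth curve, Eq. (N): y = i ** (r ** t)."""
    return i ** (r ** t)

def fit_simplex_growth(times, proportions):
    """Least-squares line through (t, log log y), then back to r and i."""
    Y = [math.log10(-math.log10(y)) for y in proportions]
    n = len(times)
    sum_t, sum_Y = sum(times), sum(Y)
    sum_tt = sum(t * t for t in times)
    sum_tY = sum(t * v for t, v in zip(times, Y))
    denom = n * sum_tt - sum_t ** 2
    A = (n * sum_tY - sum_t * sum_Y) / denom      # A = log r
    B = (sum_tt * sum_Y - sum_t * sum_tY) / denom  # B = log(-log i)
    r = 10 ** A
    i = 10 ** (-(10 ** B))
    return r, i

times = [0, 1, 2, 3, 4, 5]
data = [simplex_growth(t, 0.3, 0.4) for t in times]
r_hat, i_hat = fit_simplex_growth(times, data)
print(round(r_hat, 6), round(i_hat, 6))
```

With exact data the assumed constants are recovered; with observed percentages the same routine gives the least-squares values of A and B.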

The two constants, A and B, may be found by the method of 
least squares whenever a fairly full set of values is at hand, i.e., 
whenever we know the log logs of the percentages of development 
for units of time from incipiency to maturity. However, since 
we are assuming that the percentages of development increase in 
equal powers in equal periods of time, we may take A as the 
mean increase in the log logs from time interval to time interval 
over a known range of time. The initial point B may usually be 
determined from the display of the data. This latter method of 
finding A and B is most conveniently employed whenever but 
two or three points on the growth curve are reliably known and 
we wish to interpolate for the others, a problem of predicting the 
percentages of development for the entire growth period on 
the basis of a few known percentages, rather than of obtaining the 
formula expressive of the nature of growth of the organism¹ when 
measurements have been taken throughout its entire existence. 

Let us now apply our methods to the fitting of the data listed 
below. The data are percentages of boys passing No. 20, 


1 The word organism is here used in the group sense. It is the totality of 
elements which compose the distribution of the growth data. 
Fingers, of the Binet Test.¹ These percentages are distributed 
according to the chronological ages of the boys, and our problem 
is to find the equation of the curve which expresses the growth in 
their ability to pass the test. 


PERCENTAGES OF BOYS PASSING THE BINET TEST, No. 20, FINGERS, AT 
DIFFERENT AGE LEVELS 


Age in years         3.5   ... 
Per cent passing     0.0   ... 


Before going further we must make sure that we understand 
how to use log logs. Let us see how these values are found. 
Log 28.4 per cent = log .284 = (9.45332 — 10) (from tables). 
This may be written log .284 = —.54668. Now the log of a 
negative number does not exist. We, therefore, define the log 
of this negative quantity to be the log of the product of this 
quantity and —1. This amounts to disregarding the minus 
sign appearing before .54668. Hence 


log log .284 = log .54668 = (9.73773 — 10) 


(from tables). That is, log log .284 = −.26227. The
following may be taken as a working form when finding the log logs of 
decimals: 


log .625 = (9.79588 − 10) = −.20412 
log log .625 = log .20412 = (9.30988 — 10) = —.69012 


The reader should prove that he understands the process of 
using log logs by obtaining — 1.18376, —1.67965, — 2.21325, and 
—2.51570 as the log logs of .860, .953, .986, and .993, respectively. 
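
The working form above can also be checked by machine. The following is a minimal sketch in Python and is not part of the original text: the function name `log_log` is our own, and common (base-10) logarithms are assumed, as in the tables quoted.

```python
import math


def log_log(p):
    """Return log10 of |log10(p)| for a proportion p with 0 < p < 1.

    The minus sign of log10(p) is disregarded before the second
    logarithm is taken, as described in the text.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log10(abs(math.log10(p)))


# Proportions named in the text (28.4 per cent = .284, and so on).
proportions = [0.284, 0.625, 0.860, 0.953, 0.986, 0.993]
log_logs = [log_log(p) for p in proportions]

for p, value in zip(proportions, log_logs):
    print(f"log log {p:.3f} = {value:.5f}")
```

The machine values agree with those quoted to about three decimal places; the small remaining differences come from the rounding of five-place tables.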

Now let us examine the log logs of the limits of our range of 
percentage values. We have log .000 = −∞. Therefore, 
log log .000 = log ∞ = ∞. Also, we have log 1.000 = .00000. 
Hence, log log 1.000 = log .00000 = −∞. This tells us that 
we are unable actually to reach the true points of initial develop- 
ment and maturity through the mathematical machinery of the 
Gompertz curve. We may, however, make our errors in these 
respects as small as we wish if our measurements are so fine that 
we may approach the limits of the percentage range sufficiently 
far. It is for these theoretical reasons that we must be content 
with an arbitrary point of incipiency, and consider maturity as 
100 per cent development as measured by some natural standard. 


1 BURT, C., Mental and Scholastic Tests, P. S. King & Son, Ltd., London, 
1922. 

We are now in a position to return to the illustration of per- 
centage of boys passing the Binet test, just quoted. Choose 
as our starting point the age 4.5 years. This is the first year (or 
period in our time scale) that any growth has been recorded. 
At this point we may consider t equal to zero, and label the 
remaining periods 1, 2, 3, etc. The log logs corresponding to 
these time units are given above and are plotted in Fig. 39. 


FIG. 39. 


It is evident from the graph that the relationship between 
time units and log log percentage values is linear in trend. We 
may find the equation of the line which best expresses this rela- 
tionship by the least-squares method. To do this, we have 
merely to apply formulas (221) and (221a) of this chapter in 
order to compute the required constants A and B. Observing 
that we are using t instead of x and Y instead of y, we have as 
formulas 


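$$A = \frac{n\Sigma tY - \Sigma t\,\Sigma Y}{n\Sigma t^{2} - (\Sigma t)^{2}}, \qquad B = \frac{\Sigma t^{2}\,\Sigma Y - \Sigma t\,\Sigma tY}{n\Sigma t^{2} - (\Sigma t)^{2}}$$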


The necessary sums to be substituted into these formulas are, 
of course, obtained from our log log data on page 438. They 
are (as may readily be verified) as follows: 


ΣtY = −29.52809, Σt = 15.00, ΣY = −8.5475 
Σt² = 55.00, n = 6 


When these values are substituted into the above formulas 
and simplifications are made, we find that A = —.4666, and 
B = —.2575. 
Therefore, the equation of our growth curve expressed in 
log logs is 


(S) Y = −.4666t − .2575 
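
The least-squares computation can be verified with a short Python sketch. It is offered only as an illustration: the function name `fit_line` is ours, and we assume that the six log logs quoted earlier (for .284, .625, .860, .953, .986, and .993) belong to the time units t = 0, 1, ..., 5, an assignment that reproduces the sums given in the text.

```python
def fit_line(t, y):
    """Least-squares slope A and intercept B of the line y = A*t + B."""
    n = len(t)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_tt = sum(ti * ti for ti in t)
    denom = n * sum_tt - sum_t ** 2
    slope = (n * sum_ty - sum_t * sum_y) / denom
    intercept = (sum_tt * sum_y - sum_t * sum_ty) / denom
    return slope, intercept


# Assumption: log logs from the text, assigned to t = 0..5 in order.
t = [0, 1, 2, 3, 4, 5]
Y = [-0.26227, -0.69012, -1.18376, -1.67965, -2.21325, -2.51570]

A, B = fit_line(t, Y)
print(f"A = {A:.4f}, B = {B:.4f}")  # prints A = -0.4666, B = -0.2575
```

Under this assumption the sketch returns the same constants as the hand computation.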


To get back to the original form of the Gompertz curve (N), we 
must now return to Eq. (R), the point at which our departure 
to the straight-line form began. We have, from these equations 
and the results just obtained, that A = log r = −.4666. 

Hence, to obtain r, we must find the antilog from the tables, as 
follows: −.4666 = 9.5334 − 10. From the tables we find that 
r = .342. Again, we have B = log log i = −.2575. To obtain 
i, we must then twice take antilogs. This is done as follows: 
−.2575 = 9.7425 − 10. Thus log i = −.5529, which is the 
antilog of 9.7425 − 10 with a minus sign inserted. (Remember 
that when we found log logs of decimals, we disregarded the 
minus sign. Thus when taking antilogs, we must insert it again.) 
We shall find i by taking the antilog of −.5529. We see that 
−.5529 = 9.4471 − 10. From the tables we find that i = .28. 

Now the type equation that we assumed to express best the 
growth in the ability of boys to pass the Binet test is $y = i^{r^{t}}$. 
Hence, our final equation becomes, upon substituting the values 
of r and i, 


(T) $y = (.28)^{(.342)^{t}}$ 


Equation (T) is the curve of growth which best fits the data 
about growth of boys in passing the Binet fingers test. It may be 
looked upon as a formula for estimating percentages of develop- 
ment of boys in their ability to pass No. 20, Fingers, of the 
Binet Test. In practice, however, it may be found more convenient 
to substitute values of t into Eq. (S) and then convert 
these into percentage values by twice taking antilogs. 
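
The two antilog steps, and the conversion of fitted log logs back into percentages, may be sketched in Python as follows. This is an illustration only; the function names `gompertz_constants` and `predicted_proportion` are ours, and A and B are the rounded constants quoted above.

```python
def gompertz_constants(A, B):
    """Recover r and i of y = i ** (r ** t) from A = log r and B = log log i."""
    r = 10 ** A              # antilog of A
    log_i = -(10 ** B)       # antilog of B, with the minus sign reinserted
    i = 10 ** log_i          # second antilog
    return r, i


def predicted_proportion(t, r, i):
    """Proportion passing at time unit t according to the fitted curve."""
    return i ** (r ** t)


A, B = -0.4666, -0.2575      # rounded constants from the text
r, i = gompertz_constants(A, B)
print(f"r = {r:.3f}, i = {i:.3f}")

for t in range(6):
    print(t, f"{100 * predicted_proportion(t, r, i):.1f} per cent")
```

At t = 0 the predicted proportion is simply i, since r to the power zero is 1. The values of r and i obtained this way agree with the table values .342 and .28 to within rounding.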

Courtis’ methods of measuring growth should have wide 
application in the field of education as well as in many other 
branches of science. He has developed tables of isochrons¹ which 
are the percentages of total time to reach maturation that corre- 
spond to percentages of development. The construction of these 
tables necessitated, of course, the selection of an arbitrary range 
of log log values; i.e., it was necessary to fix upon points of initial 
development and maturity. These limitations do not, however, 
seriously affect the practical values of the isochronic system in 
many growth situations, for it has been found that a wide variety 
of growth data agrees very closely in these respects. They do, 
however, limit theoretical generalizations concerning the nature 
of all biologic growth. In addition, it will be remembered that 
the Gompertz formula is expressive of the nature of growth when, 
and only when, the log log values of the percentages of develop- 
ment are linear in trend. If these log log values obey some 
other law, then it follows that the growth involved obeys some 
other law. 


1 These tables may be obtained from the Courtis Standard Tests, 1807 E. 
Grand Boulevard, Detroit, Mich. 

In view of the minor limitations cited above, the Gompertz 
curve can well be employed in many growth situations. If the 
reader is unfamiliar with the use of logarithms, he may, of course, 
resort to the isochronic tables for ease of computations. The 
writers believe that for the beginner, at least, a fuller understanding 
of the basic principles underlying the study of growth curves 
will result from a clearer view of the mathematics involved. 


TESTING GOODNESS OF FIT 
Provided the variates can be grouped into classes so that some 
estimate may be made of the population variance of the constitu- 
ent classes, a mathematical test of goodness of fit of our curves 
can be made. As shown on page 328, the formula is 


Fre RI 
5 tes i= R 


where R? is the variance of the values computed for the regres- 
sion line—when frequencies are considered—divided by the total 
variance. The reader should refer to our earlier treatment 
and make this test for such tables as our Tables XXVII and 
XXVIII. 


THE PEARSONIAN SYSTEM OF CURVES 


A set of frequency curves representing a wide variety of statis- 
tical distributions has been developed by Pearson.² The equations 
of these curves are found by integrating a certain differential 
equation having much support in the theory of probability and 
satisfying certain geometrical properties characteristic of uni- 
modal frequency distributions. They are useful in fitting many 
symmetrical, skewed, J-shaped, and other trends of data, and 
even include the normal curve as a special case. 


1 There is, of course, no limit to the variation of curves that may give rise 
to different log log curves. 
2 PEARSON, KARL: “Mathematical Contributions to the Theory of Evolution,” 
Trans. Roy. Soc. (London), Series A, Vol. 186, pp. 343–414 (1895); 

The differential equation giving rise to the Pearsonian system 
of curves is 

(U) $\frac{dy}{dt} = \frac{(m - t)\,y}{a + bt + ct^{2}}$ 

in which m, a, b, and c are constants, and t = x/σ. 

In fitting a Pearsonian curve to a given set of data, the pro- 
cedure for determining the constants is to express them in terms 
of moments of the system, substituting for these moments those 
calculated from the data. The values thus found determine the 
particular differential equation to be integrated to obtain the 
equation of the curve to be fitted to the data at hand. Kenney 
lists formulas for m, a, b, and c, all of which are given in terms of 
moments.¹ 

After the form of the differential equation has been found, 
integration yields the equation of the type of curve to be fitted 
to the given data. The constant resulting from integration 
may then be found by using the fact that the area under the 
curve is N, the total frequency of the distribution. The equation 
thus arrived at is that of the curve fitted to the data at hand. 

We shall now examine a few types of curves of the Pearsonian 
system, indicating the form of the differential equations giving 
rise to these types and pointing out how one might proceed to use 
the equations in curve fitting. 

Type VII.—Suppose a given set of data yields the information 
that the quantities m, b, and c equal zero and that the constant a 
equals unity. Then the differential Eq. (U) becomes 


(V) $\frac{dy}{dt} = -ty$ 


“Supplement to a Memoir on Skew Variation,” Vol. 197, pp. 443–456 (1901); 
“Second Supplement to a Memoir on Skew Variation,” Vol. 216, pp. 429–457 
(1916). 

1 KENNEY, JOHN F., Mathematics of Statistics, D. Van Nostrand Company, 
Inc., 1939, Part II, p. 48. 
Integrating, we find 


$y = Ae^{-t^{2}/2}$ 


in which A is a constant to be determined from the data. We 
recognize this equation to be that of the normal curve and have 
seen elsewhere that $A = N/\sqrt{2\pi}$. The normal curve 
is known as the type VII curve of the Pearsonian system. 
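
The integration of Eq. (V) can be written out step by step. The short derivation below is ours, supplied only as a sketch; it assumes y > 0 and uses the standard result that the integral of $e^{-t^{2}/2}$ over the whole line is $\sqrt{2\pi}$.

```latex
\begin{align*}
\frac{dy}{dt} &= -ty \\
\frac{dy}{y} &= -t\,dt \\
\ln y &= -\frac{t^{2}}{2} + C \\
y &= Ae^{-t^{2}/2}, \qquad A = e^{C} \\
N &= \int_{-\infty}^{\infty} Ae^{-t^{2}/2}\,dt = A\sqrt{2\pi}
\quad\Longrightarrow\quad A = \frac{N}{\sqrt{2\pi}}
\end{align*}
```

The last line applies the condition that the area under the curve is N, the total frequency.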

FIG. 40.—Type VII curve. 

Type III.—If the moments of our data 
are such that c equals zero and m, a, and b are not equal to zero, 
(U) becomes 


(W) $\frac{dy}{dt} = \frac{(m - t)\,y}{a + bt}$ 


Integration yields an equation of the form 


$y = A(K + t)^{K^{2} - 1}e^{-Kt}$ 


in which A is a constant to be determined from the condition 
that the area under the curve equals N, and K is found from the 
moments. 

In the special case K² = 1, type III takes the form Type X, 


$y = Ae^{-t}$ 


which is also known as Laplace’s first frequency curve. 


FIG. 41.—Type III curve. FIG. 42.—Type X curve. 


The determination of the constants in the Pearsonian fre- 
quency curves is generally very laborious, involving the computa- 
tion of the first four moments and the area under the curve. 
Examples of the representation of actual frequency distributions 
by several types may be found in A First Course in Statistics by 
D. C. Jones, and a rather complete account of all the curves in 
the Pearson system has been set forth by C. C. Craig. Figures 
of 12 types resulting from different values of the constants are 
shown in H. L. Rietz’s “Mathematical Statistics” (Carus 
Monograph), Chap. III. 

For further discussions of the Pearson system of curves the 
student is referred to Pearson¹ and Elderton.² 


Exercises 


1. Fit a second degree parabola to the data of Table XXVIII, page 317, 
by the method of least squares, and test the goodness of fit. How do your 
results compare with those we give on page 329? 

2. Fit a third degree parabola to the data of Table XXVIII, page 316, and 
test the goodness of fit. What other types of curves might fit these data 
better? 
CHAPTER XVI 
THE TECHNIQUE OF CONTROLLED EXPERIMENTATION 


To experiment is to control the behavior of animate or inani- 
mate objects while we observe outcomes. Thus the physicist 
has a ball roll down an inclined plane which he adjusts in a manner 
to suit his purposes while he systematically observes the velocity 
or the distance traversed by the moving object, the agriculturalist 
treats soils in some predetermined way while he takes measured 
stock of the effects produced by the factors he is manipulating, 
or the chemist brings about certain changes in temperature and 
determines results. In essentially the same manner an educa- 
tionalist applies to groups of pupils two or more different methods 
of teaching and objectively ascertains the relative degrees of 
success from each in contributing toward the attainment of 
certain specified objectives, or the economist-statesman tries 
municipal ownership of utilities in certain cities and measures 
success over against the success of such utilities under private 
ownership in similar cities. But, while it is characteristic of 
experimentation in the strict sense to have purposive manipula- 
tion of the factors involved in the experiment so as to make them 
contribute with maximum clearness and economy toward the 
answers to the specific questions we want answered, it is some- 
times feasible to find in nature ongoings that so nearly conform 
to what we want that we may utilize them without further 
manipulation. Here selection may replace control. Such 
avoidance of the necessity for artificial manipulation is particu- 
larly convenient to a social scientist. The physicist, the chemist, 
the agriculturalist are permitted to operate on their materials 
at will—to lathe down a cylinder until it suits a particular 
purpose, to heat a solution to any required temperature, to 
treat the soil in any manner desired; but a sociologist studying 
adult societies is not at such liberty to manipulate groups of 
people for the sole purpose of his experiment. Neither does the 
economist nor the political scientist have such privilege except, 
perhaps, on rare occasions. The educationalist dealing with 
school children is able within limits to set up his environment to 
suit the needs of his research, but even he can often control his 
factors only in part and sometimes not at all. Under such 
limitations the sociologist may seek two sets of people who 
happen to differ from each other in the particular respect he 
wishes to investigate while being alike in all other essential ways 
and may use this contrast as a setting in which to study the 
effect of the differentiating factor. In the same manner the 
economist, the political scientist, the educationalist, or even 
the biologist, may take advantage of contrasts that the normal 
progress of events rather than his own manipulation has set up. 
Or the research worker may reconstruct such contrasts from 
records, thus having a sort of retroactive experiment. Com- 
parisons which depend thus upon selection rather than upon 
control proceed under a heavy handicap, but they involve the 
same fundamental principles and call for the same statistical 
techniques as true experiments. We shall, therefore, have them 
in mind as well as controlled experiments in our discussion 
throughout this chapter. 

In this chapter we are concerned chiefly with experiments 
involving growths of the sort in which students of education 
and the social and biological sciences are interested—growths 
not only in biological organism but in ideals, in skills, in informa- 
tions, in folkways, as well. Where growths are normally in 
progress, an experimental factor can have only the effect of 
changing the rate of growth. In consequence, we must always 
measure the outcome in our experimental group against a control 
situation. When an experiment has been conducted without 
such control we do not know how much of the growth to attribute 
to the factor under special study and how much to other factors. 
Normally this control situation consists of a parallel group of 
individuals, or a parallel plot of ground, or what not, precisely 
like the experimental one in all pertinent factors except the 
experimental one. For the sake of maximum contrast it is best 
to have this differential factor present in as large degree as possi- 
ble in the experimental group or plot and absent from the control 
situation; but, if that is not feasible, the two situations may 
differ by having the experimental factor present in large degree 
in the one situation and in small degree in the other. 

It seems scarcely necessary for us to say here that care must 
be exercised to keep all other conditions constant in the two 
situations except the one experimental factor; this is the law 
of the single variable, so fundamental to all scientific experi- 
mentation. But to say that the variable must be single is not to 
say that it must be atomistic—that it must be simple in the 
sense that it cannot be analyzed into components. Many 
absurd blunders have been made in educational experimentation, 
at least, by overworking the principle of a single variable in the 
sense of a simple variable. In many of the experiments on 
homogeneous grouping of pupils, for example, effort was made to 
keep the type of instruction the same on both sides, the same 
textbooks and the same subject matter—to have, in short, all 
procedures exactly alike except that instructional groups were 
of small range of talent in the experimental groups and of great 
range in the control groups. But the whole purpose of homo- 
geneous grouping, as employed normally in school, is to permit 
differentiation of instructional materials and procedures and 
adaptation of them to the differing levels of ability. When this 
part of the technique of dealing with homogeneous groups was 
discarded what was left was a mere abstraction having nothing 
to do with real educational alternatives. In useful experimenting 
we must set one practical Gestalt against another practical 
Gestalt; we must contrast one teaching procedure together with 
all the characteristics that normally accompany it with another 
procedure accompanied by all buttressing elements that would 
normally be used with it; or we must make an experimental 
contrast between one economic system together with all the 
ethical and legal and other drives that go with it to make it 
successful and another economic system accompanied also by the 
buttressing conditions essential to its integrity. In situations 
like this it is true, as the Gestalt psychologists have been pointing 
out, that the whole is more than the sum of its parts. In addition 
to the choice between major alternatives it is proper, of course, to 
make experimental comparison among specific variations within 
any one of the alternatives in order to find the effect of each 
constituent element and the optimum combination of these 
constituents. But, however the variable is defined that is to 
be our experimental factor, it must be “single” in the sense that 
it must constitute the only essential difference between our two 
situations. 


MATCHING GROUPS 


One of the most important factors to keep equal between the 
two sides is capacity to respond to the stimulus in which the 
experimental factor consists. In the case of learning experi- 
ments, that means capacity to learn materials in question on 
the part of the individuals who constitute the groups. Doubtless 
in experiments in sociology, in biology, in agriculture, and in other 
fields the capacity to respond lies also in constituent parts of the 
groups or of the plots experimented upon, but we shall direct 
our illustrations chiefly to the sort of situation typified by a 
learning experiment. In order that two groups may be optimally 
matched the mean learning ability should be the same in both 
groups and also the distribution of abilities should be of the 
same shape. It is possible to achieve this by manipulating the 
membership of the groups until the mean scores for capacity 
to learn are the same on both sides, and the standard deviations, 
and perhaps the indices of skewness and kurtosis, are alike. This 
is a perfectly legitimate way. But these ends can usually be 
achieved much more surely and easily by matching individuals 
in pairs. We have on our list, let us say, a student, A, in the 
experimental group with a certain learning score, and we seek 
for him a mate, A’, in the control group who has the same 
score for capacity to improve. Then we take in the experimental 
group a second person, B, and seek a mate, B’. Thus we con- 
tinue making pairs until we have constructed all that the per- 
sonnel of our two groups permits. For a reason which we shall 
discuss later it is desirable to select and to list these in descend- 
ing order of learning ability, as indicated by the matching scores. 
When groups are matched by this individual-pair method, it 
is automatically provided that the means of the capacity scores 
shall be the same for the two groups and that the shape of both 
distributions of abilities shall be alike. We shall soon see, too, 
that a number of collateral advantages accrue in the interpreta- 
tions of our results. We need not insist upon precisely the 
same scores for the mates, because our measuring instruments 
are so far from perfectly valid that we cannot take seriously 
discrepancies of a few points. Differences as great as 5 or 
10 per cent of the range are not too much, provided they are 
so balanced between the two sides as to keep the means practically 
the same. It is usually necessary to drop some members from 
one or both sides because they cannot be matched, but the 
number should seldom exceed 10 or 12 per cent unless the groups 
as originally constituted differ markedly in average ability. 
These unmatched individuals may remain with their groups, 
but none of their scores are to be counted in the bookkeeping 
of the experiment. Insistence upon too great precision in match- 
ing is likely to reduce the number of pairs so as to lower reliability 
unnecessarily, while too crude matching results in groups that 
do not sufficiently closely parallel each other in the distribution 
of abilities. If one group is much larger than the other, it would 
be feasible to have several mates in the large group for each 
member of the small group, but the same number of mates for 
each individual.¹ 
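
The individual-pair procedure can be sketched in Python. What follows is a simple greedy illustration and not the authors' own routine: the function name `match_pairs`, the tolerance argument, and all the scores are hypothetical, invented for the example.

```python
def match_pairs(experimental, control, tolerance):
    """Greedily pair each experimental member with the closest unused control.

    experimental, control: dicts mapping a person's label to a matching score.
    tolerance: largest score discrepancy accepted within a pair.
    Returns (pairs, unmatched_experimental, unmatched_control).
    """
    available = dict(control)  # copy, so the caller's dict is untouched
    pairs = []
    unmatched_experimental = []
    # Work in descending order of score, the order the text recommends for listing.
    ordered = sorted(experimental.items(), key=lambda item: item[1], reverse=True)
    for label, score in ordered:
        candidates = [
            (abs(score - other_score), other_label)
            for other_label, other_score in available.items()
            if abs(score - other_score) <= tolerance
        ]
        if candidates:
            _, mate = min(candidates)  # smallest discrepancy wins
            pairs.append((label, mate))
            del available[mate]
        else:
            unmatched_experimental.append(label)
    return pairs, unmatched_experimental, sorted(available)


# Hypothetical matching scores, invented for illustration only.
experimental = {"A": 105, "B": 98, "C": 81, "D": 60}
control = {"A'": 104, "B'": 99, "C'": 83, "E'": 120}

pairs, dropped_exp, dropped_ctl = match_pairs(experimental, control, tolerance=3)
print(pairs)
print(dropped_exp, dropped_ctl)
```

In this invented example D and E' find no mate within the tolerance and would be dropped from the bookkeeping of the experiment. A greedy pass does not guarantee the largest possible number of pairs, and it does not balance the discrepancies between the two sides, so the means should still be compared after matching.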

On what criterion shall we match our groups? Any criterion 
is good that is likely to correlate highly with improvement in the 
function under experimental study; if scores on a criterion do not 
correlate well above zero with improvement in the function 
studied, that criterion is useless for purposes of matching. Scores 
on an intelligence test are frequently used as a basis for matching 
in educational experiments. Intelligence test scores correlate 
only fairly highly with most of the growths with which we are 
concerned and, in consequence, do only moderately well. But 
in many situations we have nothing better. Usually scores of 
previous academic achievement are more highly predictive of 
success, especially in the same field, than intelligence test scores 
are; hence they make a better basis for matching. For some 
types of experiments the social-economic status of the home 
makes a valuable basis for matching. We can get a safer basis 
for matching by combining several criteria than from a single 
one. The ideal procedure is to pair simultaneously on all of these 
criteria, particularly if they are such as to correlate low with one 
another but each promising a rather high correlation with 
improvement in the trait studied. But pairs are then much harder 
to make than on a single criterion, unless the number of indi- 
viduals to be drawn upon is very large. We are therefore often 
forced to the policy of taking an average of the scores from several 
factors. If these are highly intercorrelated, this averaging 
of several gives us a more reliable measure of capacity just as a 
longer single test would do; but if the intercorrelations are low, 
this averaging becomes abortive, since it makes toward equal 
composite scores for all the individuals. 


1 A method of achieving the effect of matching groups without loss of 
population is discussed later in this chapter, p. 463. 

We suggest the following alternatives in regard to matching: 

1. If the function to be learned is new to the individuals, 
so that no measures of previous attainment are feasible, and if 
no other measures that are known to correlate more highly with 
the function are in sight, match on the basis of one or more 
intelligence tests. 

2. If the persons are somewhat along on the curve of learning, 
match on the basis of good objective measures of present status 
in the function to be experimented upon. Use at least some 
of the same tests, or forms of the same tests, that are to be 
employed at the end of the experiment. We suggest this for 
several reasons: (a) attainment to date is likely to be highly 
predictive of learning ability in the trait considered; (b) matching 
on the basis of initial attainment places the two mates at about 
the same position on the learning curve, and position on the 
learning curve at the beginning of the race has much to do with 
the prospect of improvement; and (c) matching on initial scores 
with which final scores are to be compared is fairly likely to 
place together mates who have experienced similarly signed errors 
of measurement, particularly if a second criterion is also employed 
as suggested below. 

3. A better basis than any measure of present attainment 
alone is a combination of some measure of present attainment, 
particularly a measure of initial status in the function under 
investigation, and a measure of prospective speed of progress 
(intelligence quotient or educational quotient). These two meas- 
ures should not be averaged but should be used as simultaneous 
bases. 

4. If more criteria are to be employed in matching, we suggest 
that they be combined into not more than two or three different 
types by averaging and that these few different types be used as 
simultaneous bases for matching. 

When averaging scores it must be remembered that elements 
in a battery get a weighting in proportion to their variabilities. 
If, therefore, we intend that the factors shall all have equal 
weight, we must put their scores into forms which have equal 
variabilities. This may be accomplished in several ways: 

1. The scores in each test except one may be multiplied by 
some index that will make all variabilities approximately the 
same. If, for example, we accept A as the basis of one, all scores 
in test B must be multiplied by $\sigma_A/\sigma_B$ in order to give factor B 
the same weight in the battery as A has. A corresponding 
thing is true of each of the other tests in the combination. If we 
wish to give multiple weight m to factor S, we can do so by multi- 
plying the scores of test S by $m\sigma_A/\sigma_S$ instead of $\sigma_A/\sigma_S$. 

2. We may reduce all scores to z form by dividing the deviation 
of each from the mean of the scores in that function for both 
groups combined by the standard deviation of these combined 
groups (see page 80). Besides being comparable for all sorts 
of measurements, such “standard scores” have some other 
advantages. 

3. We may reduce all sets of scores to a distribution with a 
standard mean and a standard variability. This amounts merely 
to a modification of method 2. 
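
Methods 1 and 2 can be illustrated with a brief Python sketch. It is ours, not the authors': the function names and the scores are hypothetical, and the population form of the standard deviation (`statistics.pstdev`) is assumed.

```python
from statistics import mean, pstdev


def rescale_to_reference(scores, reference):
    """Method 1: multiply scores by sigma_reference / sigma_scores."""
    factor = pstdev(reference) / pstdev(scores)
    return [x * factor for x in scores]


def z_scores(scores):
    """Method 2: deviation from the mean divided by the standard deviation."""
    m = mean(scores)
    s = pstdev(scores)
    return [(x - m) / s for x in scores]


def composite(*tests):
    """Average the z scores of several tests, person by person."""
    standardized = [z_scores(test) for test in tests]
    return [mean(person) for person in zip(*standardized)]


# Hypothetical scores of five persons on two tests of unequal variability.
test_a = [50, 60, 70, 80, 90]
test_b = [10, 11, 12, 13, 14]

scaled_b = rescale_to_reference(test_b, test_a)
print(pstdev(test_a), pstdev(scaled_b))
print(composite(test_a, test_b))
```

After rescaling, test B has the same standard deviation as test A, so the two carry equal weight in a sum. The z-score route reaches the same end and also puts both tests on a common mean of zero.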


MEASUREMENT OF OUTCOMES 


Having matched the groups for apparent capacity to respond 
to the experimental factor and having kept all the factors except 
the experimental one constant while time elapsed for the spread 
of the two groups in growth, our next task becomes the measure- 
ment of progress. The least amount of measurement admissible 
is a test of achievement at the end of the experiment. If, how- 
ever, conditions permit, it is highly desirable to have measure- 
ments of progress from time to time within the course of the 
experiment, so that we may be able to compare the two growth 
curves at several points instead of merely at the end. It is, too, 
highly desirable to have one or more delayed measurements in 
order to ascertain how well the differential advantage persists. 

It is particularly desirable that the measurements be thorough. 
If the tests employed have low reliabilities, on account of short- 
ness or other limitation, the obtained differences are smaller 
than the true ones. Equally unfortunate is the practice of 
measuring for only a few, or even only a single one, of the traits 
potentially affected by the experimental factor. When one has 
gone to the trouble to match groups and to maintain a differential 
in treatment of these groups, it is too bad to stop with less than 
the most nearly complete answer to our question that the situa- 
tion would be able to make. Ideally we should measure with 
respect to every type of outcome that might hypothetically be 
affected by the experimental factor. Gates¹ gave a good exam- 
ple of comprehensive measurement when, in one of his experi- 
ments, he measured differences in seventeen different traits. 

Later in this chapter we shall return to the discussion of the 
importance of thorough and valid measurements. 

In all careful experimentation the measure employed for con- 
trasting the central tendencies of the groups is the mean, not the 
median. Medians have lower reliabilities than means. Some- 
times differences are taken in terms of proportion, e.g., the 
proportion of pupils elected to student offices from each of the 
groups, or the proportion making honor grades. We may also 
wish to compare the variabilities, since one of the methods may 
make for evenness of attainment on the part of the members of 
the group while the other may make for differences among indi- 
viduals. Furthermore, we may desire to measure certain out- 
comes in terms of coefficients of correlation; particularly we may 
wish to know which sort of treatment brings attainments that 
correlate most highly with the measure of learning ability that 
constituted the basis for matching. We may desire, too, to 
know what proportion of individuals exceeded their mates by 
each method, and whether those who exceeded their mates in 
the experimental group were at the high levels of “intelligence” 
or at the low levels, or scattered at random through the 
distribution. 


RELIABILITY OF DIFFERENCES 


Having found differences, our next task is to consider their 
importance. Are they large enough to claim much attention? 
There are several ways in which we can play up our differences 
so as to give to ourselves and to others some rather concrete 
notion of their degree of importance. 

1 GATES, A. I., “A Modern Systematic versus an Opportunistic Method 
of Teaching,” Teach. Coll. Rec., Vol. 27, pp. 679–700. 

1. We may put them in terms of percentage. We may say 
that the control group gained an excess of 0.43 points over the 
experimental group, which is 5.3 per cent of the size of the mean 
of the latter group; or that the gain by the control group was 
118 per cent of that of the experimental group. 

2. We may put the difference in terms of the period of time 
normally required for making as much progress. Thus we may 
say that the mean of the experimental group exceeded that of 
the control group by an amount equal to 3 months of educational 
age, or by half as much as the normal gain between the freshman 
and the senior year in a typical liberal arts college. 

3. We may put the difference in terms of standard measures. 
This we do by dividing the difference between the means by the 
standard deviation of the scores of the two groups combined. To 
put the difference thus in terms of standard deviations is to give 
to it a meaning that is the same for all sorts of measurements 
and all sorts of situations and is a highly desirable practice. 

4. We may show how far our obtained difference is from a 
chance one, and, consequently, what degree of assurance we may 
have that it will not turn in the opposite direction with further 
sampling. This is the matter of reliability and is dependent 
upon the size of the groups as well as upon the size of the differ- 
ence. Since it does not turn upon the absolute size of the 
difference but upon a combination of this and the size of the 
population, it does not belong in the same category as the pre- 
ceding three. Indeed a good interpretation of a difference 
(especially between means) should include both one of the previ- 
ous three showings and this one on reliability in addition. 

In our chapter on Reliability of Differences we gave the neces- 
sary formulas for the computation of these reliabilities. We shall 
here merely apply some of them to a typical experiment. We 
choose for this purpose part of a small experiment by John A. 
Cooper (Pennsylvania State master’s thesis on “The Relation of 
Participation in College Athletics to Academic Success as 
Measured by Objective Tests”). The data we wish to use are 
set forth in Table XL. The students were matched on intelli- 
gence test scores and attainment was measured by the Carnegie 
Foundation Test for College Seniors. The differences by pairs 
of matched students are given in the column farthest to the 
right. 

TABLE XL. Comparative Scores of Matched Athletes and Nonathletes on 
the Carnegie Achievement Tests at Pennsylvania State College, 1928 

             Athletes                          Nonathletes
Student  Intelligence  Achievement   Student  Intelligence  Achievement   Difference
            score         score                  score         score

   D          81           529          C          81           426          +103
   F          79           410          P          79           332          + 78
   M          91           583          N          91           797          -214
   P          96           380          F          94           486          -106
   R          90           580          H          89           656          - 76
   S         115           589          R         109           514          + 75
   Y          73           539          C          73           489          + 50
   C         116           824          B         124           751          + 73
   M          98           506          H          98           538          - 32
   M         105           832          F         105           455          +377
   W          95           595          A          95           799          -204
   B          95           592          R          93           564          + 28
   F          99           286          B         102           599          -313
   A          92           397          P          92           638          -241
   B          96           350          E          96           588          -238
   M          87           582          H          87           491          + 91
   R          97           544          H          97           630          - 86
   S          64           345          M          64           526          -181
   J          89           542          S          89           545          -  3
   P          90           592          M          90           590          +  2
   S          61           535          N          63           466          + 69
   E          75           389          W          75           372          + 17
   M         119           857          K         115           584          +273
   E         111           571          B         111           537          + 34
   K          88           555          F          88           541          + 14
   M          95           511          G          97           477          + 34
   D          76           484          F          76           382          +102
   T          67           408          M          71           537          -129
   R          85           523          H          84           284          +239
   S          89           387          G          59           343          + 44
   G          89           381          G          89           870          -489
   G          89           356          A          87           467          -111
   G          63           362          S          62           417          - 55
   E          81           375          E          84           544          -169
   M          75           569          K          72           420          +149
   W          77           394          M          77           424          - 30

Means         88.5         507                     87.5         530          - 23
Sigmas         ..          135                     14.9         130.44        166.25

Intelligence score call factor 1; athletic-achievement score, 2; and 
nonathletic achievement score, 3. $r_{12} = .564$, $r_{13} = .492$, $r_{23} = .215$. 

The mean of the athletes’ scores is 507, that of the nonathletes 
530, and the difference 23. This last quantity tallies, as it 
must, with the sum of the differences by pairs (last column) 
divided by N. By formula (95), the standard error of this 
difference is 

$$\sigma_{m_2 - m_3} = \sqrt{\sigma^2_{m_2} + \sigma^2_{m_3} - 2r_{23}\sigma_{m_2}\sigma_{m_3}} = \sqrt{\frac{135^2 + 130.44^2 - 2(.215)(135)(130.44)}{36}} = 27.7$$

The ratio of the difference to its standard error is 

$$t = \frac{23}{27.7} = 0.83$$

This is far short of the conventionally demanded ratio of 3. But 
as we pointed out earlier, there is no magic in a ratio that reaches 
exactly 3, although it is true that near this point the odds against 
reversal begin to mount extremely rapidly with increasing ratios. 
On pages 169 to 170 we explained the method of interpreting 
such a ratio of the difference to its standard error. Reference to 
the table of integrals of the normal curve shows that, if the true 
difference were zero, we would expect to obtain a difference as 
much as 0.83 standard errors above zero in 0.203 of the trials 
while we would expect the opposite in .797 of the trials. The 
chances are, therefore, 3.9 to 1 that the true difference is in 
favor of the nonathletes. 
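
The computation just described may be checked with a short sketch. It is only an illustration: the differences are transcribed from the last column of Table XL, the standard deviation is taken with N in the denominator (an assumption that reproduces the tabled sigma of 166.25), and the names `mean_and_sd` and `upper_tail` are ours. For matched pairs the standard deviation of the differences divided by the square root of N gives the same standard error as the correlational formula.

```python
import math

# Athlete-minus-nonathlete differences, transcribed from the last column of Table XL.
differences = [
    103, 78, -214, -106, -76, 75, 50, 73, -32, 377, -204, 28,
    -313, -241, -238, 91, -86, -181, -3, 2, 69, 17, 273, 34,
    14, 34, 102, -129, 239, 44, -489, -111, -55, -169, 149, -30,
]


def mean_and_sd(values):
    """Return the mean and the standard deviation (N in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def upper_tail(z):
    """Area of the unit normal curve lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


n_pairs = len(differences)
mean_diff, sd_diff = mean_and_sd(differences)
standard_error = sd_diff / math.sqrt(n_pairs)   # sigma of the mean difference
ratio = abs(mean_diff) / standard_error         # difference over its standard error
p_opposite = upper_tail(ratio)                  # chance of the opposite sign
odds = (1 - p_opposite) / p_opposite

print(f"mean difference = {mean_diff:.1f}, sigma of differences = {sd_diff:.2f}")
print(f"standard error = {standard_error:.1f}, ratio = {ratio:.2f}")
print(f"odds that the true difference has the observed sign = {odds:.1f} to 1")
```

Because the sketch carries more decimals than the hand computation, the tail area may differ from the tabled value in the third decimal place.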

Are athletes more variable in attainment than nonathletes? 
The standard deviation of the scores of the former is 135 while 
that of the latter is 130, a difference of 5. Is that a reliable 
difference? The reliability formula required here is (104): 

$$\sigma_{\sigma_2 - \sigma_3} = \sqrt{\sigma^2_{\sigma_2} + \sigma^2_{\sigma_3} - 2r^2_{23}\sigma_{\sigma_2}\sigma_{\sigma_3}} = \sqrt{\frac{\sigma_2^2 + \sigma_3^2 - 2r^2_{23}\sigma_2\sigma_3}{2N}}$$

The r here is .215 which when squared gives .04. To use this r 
here will really make only an insignificant difference, but we shall 
do it anyway in order to illustrate the principle. Then 

$$\sigma_{\sigma_2 - \sigma_3} = \sqrt{\frac{135^2 + 130^2 - (2)(.04)(135)(130)}{72}} = 21.6$$

Dividing the difference by its standard error, we get 

$$t = \frac{5}{21.6} = 0.23$$
This is too small a ratio to give any appreciable assurance that 
further sampling will not show the true difference to lie on the 
opposite side. The odds are only 1.5 to 1 that the true difference 
lies in the indicated direction. 
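
The reliability of the difference between the two standard deviations can be sketched in the same way. This is a minimal illustration of formula (104) as applied in the text, using the rounded figures of the text (sigmas of 135 and 130, r squared taken as .04, N of 36); the function names are ours.

```python
import math


def se_sd_difference(sd_a, sd_b, r_squared, n):
    """Standard error of the difference between two correlated standard deviations."""
    return math.sqrt((sd_a**2 + sd_b**2 - 2 * r_squared * sd_a * sd_b) / (2 * n))


def upper_tail(z):
    """Area of the unit normal curve lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


sd_athletes, sd_nonathletes = 135.0, 130.0
se = se_sd_difference(sd_athletes, sd_nonathletes, r_squared=0.04, n=36)
ratio = (sd_athletes - sd_nonathletes) / se
odds = (1 - upper_tail(ratio)) / upper_tail(ratio)

print(f"standard error = {se:.1f}, ratio = {ratio:.2f}, odds = {odds:.2f} to 1")
```

The odds come out a little under the 1.5 to 1 quoted in round figures.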

We have made here the “large sample” type of interpretation, 
which is appropriate for an N of 36. This form of interpretation 
assumes that $s_1 - s_2$ is distributed normally. As a matter of 
fact, the distribution of standard deviations from samples is 
not normal and neither, in consequence, is that of the difference 
between standard deviations. But they approach normality as 
N increases, and, with N = 25 or more, the error in taking them 
to be normal is small and is not very great even with an N as low 
as ten.¹ But for an “exact” interpretation in small samples the 
probability of getting in a sample a divergence between σ’s as 
great as the one in hand must be determined from the distribution 
of $s_1/s_2$ in the manner discussed on pages 335 to 337. The 
probability of getting a given divergence as measured by this 
ratio is the same as the probability of getting the same divergence 
in the form of a difference between these same sample values, 
because it is the probability of getting simultaneously sample 
values as far apart as these or farther when the true difference is 
zero. 

Do the attainments of athletes correlate more highly with 
“intelligence” than those of nonathletes? In order to determine 
this, we must compute the r’s between intelligence test scores 
and achievement test scores for both the athletic and nonathletic 
group. For the athletes this r is .564 and for the nonathletes 
it is .492. There is a difference of .072. Formula (107) 
would be the correct one to use. But in practice we usually 
ignore the r between the r’s unless very precise results are required 
with a population large enough to justify such attempts at 
precision. In view of the low correlations in our problem, and 
the smallness of our population, we shall use formula (109) which 
is not only sufficient for our purpose here but will usually be 
sufficient when comparing the r’s in controlled experimentation. 

$$\sigma_{r_{12} - r_{13}} = \sqrt{\sigma^2_{r_{12}} + \sigma^2_{r_{13}}} = \sqrt{.0191 + .0210} = .20$$


The ratio of the difference between the r’s to the standard error 
of that difference is .36, which is again too low to have satisfactory 
statistical significance. 

¹ See Pearson, Karl, Biometrika, Vol. 10, p. 529; or Deming and Birge, 
“Theory of Statistical Errors,” Rev. Modern Phys., Vol. 6, p. 129. 
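
The arithmetic of this comparison is brief enough to verify directly. The sketch below simply takes the two sampling variances of the r’s as printed (.0191 and .0210) and forms the ratio of the difference to its standard error; the variable names are ours.

```python
import math

r_athletes = 0.564          # intelligence with achievement, athletes
r_nonathletes = 0.492       # intelligence with achievement, nonathletes
var_r_athletes = 0.0191     # squared standard error of the first r, as printed
var_r_nonathletes = 0.0210  # squared standard error of the second r, as printed

# Formula (109): the correlation between the two r's is ignored.
se_difference = math.sqrt(var_r_athletes + var_r_nonathletes)
ratio = (r_athletes - r_nonathletes) / se_difference

print(f"standard error = {se_difference:.2f}, ratio = {ratio:.2f}")
```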

The difference between the means is 0.18 standard deviations. 
In spite of the fact that the difference between the means is 
slightly in favor of the nonathletes, only seventeen of the non- 
athletes exceeded their mates in achievement score while nine- 
teen athletes excelled. These advantages are scattered so 
miscellaneously over the range as not to suggest any relation 
between intelligence and the effect of athletic participation upon 
scholarship. This experiment shows slight if any difference in 
academic attainments between athletes and nonathletes at the 
college level. If there are real differences, our number of sub- 
jects is too small to demonstrate them. A little later we shall 
discuss the question of number of cases required to show decisive 
results where there are true differences but very small ones. 

The example treated above employed measurements of only 
end differences. Table XLI gives data from an experiment in 
which measurement is in terms of gains. It is from a Pennsyl- 
vania State College master’s thesis by Boyd M. Beagle on the 
effect of technical analysis in the teaching of appreciation of 
poetry. The experimental group is the one for which technical 
analysis had a place in the teaching between the time of taking 
initial measurements and the time of final measurement, while 
from the teaching of the control group such analysis was absent. 
Our formula for reliability in such application is (97), given on 
page 168. It is left for the reader to apply, as are also the other 
desirable interpretations. 

The most carefully controlled experimentation involves well- 
matched groups. But sometimes experimenting is done with 
random groups. This was especially true of the early experi- 
ments, but sometimes occurs at present. A notable recent 
example is An Experimental Study of the Educational Influences 
of the Typewriter in the Elementary School Classroom by Ben D. 
Wood and Frank N. Freeman. When the number of individuals 
or groups compared is very large, random selection is fairly 
likely to bring about near equality in learning ability between 
the two sides. But even in the Wood-Freeman experiment, 
involving nearly 15,000 pupils, differences between the groups 
in scores of aptitude were sometimes appreciable, as were also 
differences in the average quality of the teachers on the two 
sides. A very much smaller number of subjects carefully 
matched would give more decisive and less ambiguous results 
than a larger number only loosely matched. 

TABLE XLI. Scores in Judging Poetry by an Experimental and a 
Control Group: Abbott-Trabue Test 

[Table body not legible in this copy. Column abbreviations: P.N. = pair 
number; I.S. = initial score; E.S. = end score; G. = gains; D.G. = 
difference in gains.] 

When random rather than matched groups are employed, the 
r in the third term of the reliability formulas is to be regarded 
as zero, so that the third term drops out. For difference between 
means, the formula then becomes 

$$\sigma_{m_1 - m_2} = \sqrt{\sigma^2_{m_1} + \sigma^2_{m_2}}$$

This formula is frequently employed even when the contrasted 
groups have been matched. Then the obtained standard error 
of the difference is higher than it should be, and the indicated 
reliability is too low. It may be worth while to ask how much 
too low. If the correlation were perfect and the two standard 
deviations were equal, the $(-2r\sigma_{m_1}\sigma_{m_2})$ would completely offset 
the $(\sigma^2_{m_1} + \sigma^2_{m_2})$, so that the resulting standard error would be 
zero. If the r = .50, the residuum under the radical would 
be half as great as if the r were not considered and hence the 
standard error, $\sqrt{.50}$ or .71 as large. If the r were only .20, 
the standard error would be $\sqrt{.80} = .89$ as much. So, when 
the r is large, consideration of it makes a vast difference, particularly 
in view of the fact that the odds increase much more rapidly 
than the standard error decreases; but, when the r is small, the 
effect of its consideration is negligible. 
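
The shrinkage of the standard error when the correlation term is retained can be tabulated in a few lines. The sketch assumes, as the text does, that the two standard errors of the means are equal; `se_difference` and `shrinkage` are illustrative names of ours.

```python
import math


def se_difference(sigma_m1, sigma_m2, r=0.0):
    """Standard error of a difference between means; r = 0 gives the random-groups form."""
    return math.sqrt(sigma_m1**2 + sigma_m2**2 - 2 * r * sigma_m1 * sigma_m2)


def shrinkage(r, sigma_m=1.0):
    """Ratio of the standard error using r to the one ignoring r (equal sigmas assumed)."""
    return se_difference(sigma_m, sigma_m, r) / se_difference(sigma_m, sigma_m)


for r in (0.20, 0.50, 0.80, 1.00):
    print(f"r = {r:.2f}: standard error is {shrinkage(r):.2f} as large")
```

With equal sigmas the ratio reduces to the square root of (1 - r), which is why a small r changes the result so little.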

If the r belongs and has not been used, it may be of interest 
when evaluating an experiment to speculate on the amount of 
error involved. The amount of correlation to be expected 
between the end scores may be inferred from the r between the 
matching factor and the end scores, which is sometimes given 
and more often can be roughly guessed. If the two end arrays 
are uncorrelated except through the matching factor, the partial 
r with the matching factor held constant will be zero. Thus, if 
the subscript 1 refers to the matching factor and 2 and 3 to the 
two arrays of end scores, 


$$r_{23.1} = \frac{r_{23} - r_{12}r_{13}}{\sqrt{1 - r^2_{12}}\,\sqrt{1 - r^2_{13}}} = 0$$

Clearing of fractions and solving, 

$$r_{23} - r_{12}r_{13} = 0$$

$$r_{23} = r_{12}r_{13} \tag{222}$$
Thus the cross correlation can be expected to be the product 
of the r’s between the matching scores and the scores in each 
of the end arrays. If the two r’s between matching and final 
scores are equal, the r between the arrays of end scores is the 
square of the r between end scores and matching scores. If 
subjects are matched on intelligence test scores and achievement 
is measured in terms of school grades, a correlation of from 
about .30 to .50 may be expected between matching scores and 
achievement so that the cross correlation may be expected to 
run from .09 to .25. If matching is done on an objective achievement 
test closely related to the final one, an r of .70 or .80 may be 
expected between matching and achievement arrays and an r 
between the two arrays of end scores of .49 to .64. 

A trial of this formula on Cooper’s data, Table XL, gives 


$$r_{23} = (.564)(.492) = .277$$


The correlation obtained by computation is .215. In so small 
a population the assumptions are not fulfilled sufficiently well to 
give more than a reasonable approximation to the correct r. 
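
The reasoning behind formula (222) can be illustrated numerically. The sketch below uses the ordinary first-order partial correlation formula and Cooper’s three r’s; the function name `partial_r` is ours. It shows that the product of the two matching r’s is exactly the cross correlation at which the partial r vanishes, and that the obtained r of .215 falls a little below it.

```python
import math


def partial_r(r23, r12, r13):
    """Correlation between factors 2 and 3 with factor 1 held constant."""
    return (r23 - r12 * r13) / math.sqrt((1 - r12**2) * (1 - r13**2))


r12, r13 = 0.564, 0.492        # matching factor with each array of end scores
expected_r23 = r12 * r13       # formula (222)
observed_r23 = 0.215           # obtained by computation from Table XL

print(f"expected cross correlation = {expected_r23:.3f}")
print(f"partial r at the expected value = {partial_r(expected_r23, r12, r13):.3f}")
print(f"partial r at the observed value = {partial_r(observed_r23, r12, r13):.3f}")
```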


ON CORRELATIONS WHERE GAINS ARE MEASURED 


If measurement is made in terms of gains between initial and 
final status, the r’s between the gains by the two groups must be 
expected to be very low except where the subjects are near the 
beginning of the growth curve in the function under study at 
the time the investigation begins. That is because, as the curve 
flattens out, gains correlate much less with initial scores than 
end scores do, so that, when squared, these r’s suggest practically 
a zero correlation between the two arrays of gains. An increase 
in standard deviation between initial scores and final ones 
suggests a positive correlation between gains and initial status, 
while the absence of an increase or an actual decrease in the 
standard deviation suggests a zero or a negative correlation. 
The chief factor in making toward such low or negative correlations 
between gains and initial status is unreliability in the tests 
used for measuring gains. We have been able to show (although 
we shall not here consume the necessary space to give the derivation) 
that a negative correlation between gains and initial 
status in the same function in which the gains are measured is 
attributable to unreliability in the measuring instrument and 
that this correlation has the following magnitude: 

$$r_{x_1 g_u} = -\sqrt{\tfrac{1}{2}(1 - r_{12})} \tag{223}$$

where $x_1$ is an initial score in the function, $r_{12}$ is the reliability 
coefficient of the test, and $g_u$ is that part of a gain between first 
and second testing that is due solely to the unreliability of the 
testing instrument. If the matching is not on initial status in 
the function in which gains are to be measured but on some 
outside criterion instead (say a general intelligence test, which 
we shall label with the subscript 0), and it is desired to know the 
r between gains due to unreliability and this matching factor, we 
can show that 

$$r_{0 g_u} = -r_{01}\sqrt{\tfrac{1}{2}(1 - r_{12})} \tag{224}$$


It is because the positive r between truly measured gains and 
truly measured initial status only partly offsets this negative 
correlation due to unreliability in the measuring instruments 
that the r between gains and the matching factor so often proves 
to be zero or slightly negative. This argument also shows that, 
where results are to be measured in terms of gains, matching 
is much less important than where comparisons are in terms of 
only end scores. 
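
To give some notion of the size of this negative correlation, the two formulas can be evaluated for a few reliability coefficients. The sketch assumes formula (223) in the form r = -sqrt((1 - r12)/2) and formula (224) as that value multiplied by the r between the outside criterion and initial status; the reliabilities and the r of .50 below are hypothetical figures chosen only for illustration.

```python
import math


def r_initial_with_chance_gain(reliability):
    """Formula (223): r between initial score and the part of a gain due to unreliability."""
    return -math.sqrt(0.5 * (1 - reliability))


def r_outside_with_chance_gain(r_outside_initial, reliability):
    """Formula (224): the same r when matching is on an outside criterion."""
    return r_outside_initial * r_initial_with_chance_gain(reliability)


for reliability in (0.60, 0.80, 0.95):  # hypothetical reliability coefficients
    direct = r_initial_with_chance_gain(reliability)
    outside = r_outside_with_chance_gain(0.50, reliability)  # hypothetical r of .50
    print(f"reliability {reliability:.2f}: (223) gives {direct:.3f}, (224) gives {outside:.3f}")
```

A perfectly reliable test gives zero in both formulas, and the correlation grows more negative as the reliability falls.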

Sometimes the occasion arises to infer what the r would be 
between some experimental factor ($x_a$) and true gains in another 
function. For example, we take initial and final measures of 
academic achievement and thus obtain fallibly measured gains 
(g). We compute the r between these fallibly measured gains 
and the extent of participation in extracurricular activities. 
We wish to infer what the r would be with the disturbing effect 
of the unreliability removed. We need, therefore, the r between 
gains and ECA with the gains due to the unreliability of the 
test held constant. Our regular formula for partial correlation is 

$$r_{x_a g \cdot g_u} = \frac{r_{x_a g} - r_{x_a g_u}\,r_{g g_u}}{\sqrt{1 - r^2_{x_a g_u}}\,\sqrt{1 - r^2_{g g_u}}} \tag{225}$$
The $r_{x_a g}$ we can compute from our data. 

$r_{g g_u}$ can be shown to equal $(2\sigma_{x_1}/\sigma_g)\sqrt{\tfrac{1}{2}(1 - r_{12})}$ (225a) 

$r_{x_a g_u}$ can be shown to equal $-r_{x_a x_1}\sqrt{\tfrac{1}{2}(1 - r_{12})}$ (225b) 

We calculate these several r’s and substitute them in our partial 
correlation formula. 


COMBINING SEVERAL TRIALS 

Sometimes we may wish to combine into a single showing the 
results from a number of trials in an experiment. We then 
need the difference between a sum of means by the experimental 
group and sum by the control group. If individuals have been 
paired, the standard error of such difference is simply the stand- 
ard deviation of the column representing the differences between 
corresponding sums of scores by the paired individuals when 
this standard deviation has been divided by the square root 
of the number of pairs. That is, we make the same combination 
of scores of individuals as we intend to make of means, take the 
standard deviation of the resulting array, and divide this by 
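$\sqrt{N}$. This involves formula (98). If subjects are not paired 
individually and yet groups are matched, so that an element of 
correlation is present, the formula for the standard error is 
easily derived by anyone for any combination he needs as 
follows: 

$$
\begin{aligned}
\sigma^2_{(m_1 + m_2 + m_3 + \cdots) - (m_4 + m_5 + m_6 + \cdots)}
&= \frac{\Sigma(m_1 + m_2 + m_3 + \cdots - m_4 - m_5 - m_6 - \cdots)^2}{S} \\
&= \frac{\Sigma m_1^2}{S} + \frac{\Sigma m_2^2}{S} + \cdots + \frac{\Sigma m_4^2}{S} + \cdots
 + \frac{2\Sigma m_1 m_2}{S} + \cdots - \frac{2\Sigma m_1 m_4}{S} - \cdots \\
&= \sigma^2_{m_1} + \sigma^2_{m_2} + \sigma^2_{m_3} + \cdots + \sigma^2_{m_4} + \sigma^2_{m_5} + \sigma^2_{m_6} + \cdots \\
&\quad + 2r_{12}\sigma_{m_1}\sigma_{m_2} + 2r_{13}\sigma_{m_1}\sigma_{m_3} + \cdots
 - 2r_{14}\sigma_{m_1}\sigma_{m_4} - \cdots
\end{aligned}
\tag{226}
$$

(Continue with all possible combinations.) 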


If the groups are random ones instead of matched ones, all r’s 
are zero and only the terms of the form $\sigma^2_m$ remain. But it 
should be remembered that only closely similar units are to be 
thus additively combined. 
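
For the paired case the procedure is simple enough to sketch. The scores below are hypothetical (five matched pairs, three trials each) and serve only to show the order of operations: sum each individual’s trials, difference the sums by pairs, take the standard deviation of that column, and divide by the square root of the number of pairs.

```python
import math

# Hypothetical data: one row per matched pair, one entry per trial.
experimental = [[12, 15, 14], [10, 11, 13], [16, 18, 17], [9, 12, 10], [14, 13, 15]]
control = [[11, 14, 12], [10, 12, 11], [15, 15, 16], [10, 11, 9], [12, 13, 13]]

# Same combination of scores for individuals as is intended for the means.
diff_of_sums = [sum(e) - sum(c) for e, c in zip(experimental, control)]

n_pairs = len(diff_of_sums)
mean_diff = sum(diff_of_sums) / n_pairs
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diff_of_sums) / n_pairs)
standard_error = sd_diff / math.sqrt(n_pairs)

print(f"differences of sums: {diff_of_sums}")
print(f"mean difference = {mean_diff:.2f}, standard error = {standard_error:.3f}")
```

No correlation terms appear, because the pairing carries the correlation into the column of differences itself.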


A REGRESSION TECHNIQUE FOR MATCHING GROUPS 


We can employ a regression technique for the hypothetical 
matching of groups, which obviates the necessity of having pre- 
cisely matched pairs. In our control group we determine the 
regression of end scores upon the matching scores. Then we 
predict for each member of the experimental group the end score 
he would be expected to make on the basis of his learning ability 
as indicated by his matching score. If the mean of the obtained 
scores is significantly greater than that of the “expected” 
scores, the experimental factor is indicated as having a differential 
potency in contributing to growth. We shall apply this tech- 
nique to Cooper’s experiment reported on page 454. We need the 
ordinary rectilinear regression equation and need to make predictions 
by the method explained on pages 110 to 112. Letting $X_1$ 
be the intelligence test scores for the nonathletes and $X_3$ their 
final scores, letting $X'_3$ be a predicted score rather than an 
obtained one, and employing the numerical values of the statistics 
as computed from Tables XL and XLII, 

$$X'_3 = r_{13}\frac{\sigma_3}{\sigma_1}X_1 + \left(M_3 - r_{13}\frac{\sigma_3}{\sigma_1}M_1\right) = .492\,\frac{130.44}{14.9}\,X_1 + \left(530 - .492\,\frac{130.44}{14.9}\,87.5\right)$$

We now apply this regression equation to predicting what 
should be the final score for each member of the experimental 
group in case the experimental factor had no differential effect. 
To do this we merely enter in succession as $Y_1$ the intelligence 
test scores of the members of the experimental group. Rewriting 
the equation in terms of $Y'$ with the complex expressions evaluated, 
we have 

$$Y' = 4.307Y_1 + 153.12$$


The first athlete in the table has an intelligence test score of 81. 
Substituting this for $Y_1$ in the regression equation, we predict 
for him a score of 501.99. He actually made a score of 529, so 
that for him $Y - Y' = +27.01$. The predicted scores, attained 
scores, and differences for the 36 individuals are entered in 
Table XLII. 

¹ This is similar to Fisher’s covariance technique. Obviously it can be 
applied to agricultural or sociological or to other data with suitable control 
factors and the substitution of the appropriate outcomes for “growth.” 
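
The prediction for the first athlete can be reproduced as follows. The sketch uses only the control-group statistics quoted with Table XL and rounds the slope and intercept as the text does; `predicted_score` is an illustrative name.

```python
# Statistics of the control group (nonathletes), as given with Table XL.
r_13 = 0.492       # intelligence with achievement
sigma_1 = 14.9     # sigma of intelligence scores
sigma_3 = 130.44   # sigma of achievement scores
mean_1 = 87.5      # mean intelligence score
mean_3 = 530.0     # mean achievement score

# Slope and intercept of the regression line, rounded as in the text.
slope = round(r_13 * sigma_3 / sigma_1, 3)
intercept = round(mean_3 - (r_13 * sigma_3 / sigma_1) * mean_1, 2)


def predicted_score(intelligence):
    """Achievement score expected from the control-group regression line."""
    return slope * intelligence + intercept


first_predicted = predicted_score(81)
first_difference = 529 - first_predicted   # attained minus predicted

print(f"Y' = {slope}Y1 + {intercept}")
print(f"predicted = {first_predicted:.2f}, difference = {first_difference:+.2f}")
```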

The mean difference of the predicted scores from the attained 
scores is —27.47, so that the athletes had lower scores on the 
achievement test than nonathletes of corresponding intelligence. 
In order to test whether or not this is a significant difference, 
we need to know the standard error of this mean difference. 
We shall take first the case where the experimental group and the 
control group have the same N and where the two groups are 
perfectly equated. 

Let y be an obtained end score in the experimental group 
(instead of $y_3$, for the moment), and let $y'$ be a predicted one; 
similarly, let $x'_3$ be a predicted final score for the control group. 
Let us take both $\bar{y}$ and $\bar{y}'$ as deviations of sample means from the 
means of the whole of their respective populations. Then 

$$\sigma^2_{\bar{y} - \bar{y}'} = \sigma^2_{\bar{y}} + \sigma^2_{\bar{y}'} - 2r_{\bar{y}\bar{y}'}\sigma_{\bar{y}}\sigma_{\bar{y}'}$$

But, in each sample, the mean of the predicted $y'$ scores is the 
same as the mean of the $x'_3$ scores (which also equals the mean 
of the $x_3$ scores), since they were predicted from initial scores 
perfectly equated with the corresponding $x_1$ scores. Hence, 
making this substitution, 

$$\sigma^2_{\bar{y} - \bar{y}'} = \sigma^2_{\bar{y}} + \sigma^2_{\bar{x}_3} - 2r_{\bar{y}\bar{x}_3}\sigma_{\bar{y}}\sigma_{\bar{x}_3} = \frac{1}{N}\left[\sigma^2_y + \sigma^2_{x_3} - 2r_{y x_3}\sigma_y\sigma_{x_3}\right] \tag{A}$$


Again, in the sample, 

$$\sigma^2_{y - y'} = \sigma^2_y + \sigma^2_{y'} - 2r_{yy'}\sigma_y\sigma_{y'}$$

Now $r_{yy'} = r_{yy_1}$ because each $y'$ is the corresponding $y_1$ multiplied 
by a constant, b. $\sigma_{y'}$ is the standard deviation of the scores 
predicted as lying on the regression line, and (page 240) we 
showed that this equals $r\sigma_y$. However, since the $y'$ scores are predicted 
from the x regression line, $\sigma_{y'} = r_{x_1 x_3}\sigma_{x_3}$. Making these 
substitutions, we have 

$$\sigma^2_{y - y'} = \sigma^2_y + \sigma^2_{x_3}r^2_{x_1 x_3} - 2r_{yy_1}r_{x_1 x_3}\sigma_y\sigma_{x_3}$$

But we showed on page 459 that, assuming no correlation except 
through the matching factor, 

$$r_{yy_1}r_{x_1 x_3} = r_{y x_3}$$

Substituting this and adding $\sigma^2_{x_3}(1 - r^2_{x_1 x_3})$ to both sides, 

$$\sigma^2_{y - y'} + \sigma^2_{x_3}(1 - r^2_{x_1 x_3}) = \sigma^2_y + \sigma^2_{x_3} - 2r_{y x_3}\sigma_y\sigma_{x_3} \tag{B}$$

But, allowing for the fact that the σ’s in (A) are the population 
variances and those in (B) are the sample variances, this quantity 
is the same as the quantity in brackets in (A). We can simplify 
(A) by substituting (B) into it. We also adjust the term outside 
the brackets to the sample value by dividing by (N - 1) instead 
of by N. We then have 

$$\sigma^2_{\bar{y} - \bar{y}'} = \frac{\sigma^2_{y - y'} + \sigma^2_{x_3}(1 - r^2_{x_1 x_3})}{N - 1} \tag{227}$$

(Sampling variance of the difference between means when experimental 
and control groups are perfectly equated) 

That is for the case of perfectly equated groups. We need 
a more general formula. Since we are using population variances, 
our formula need not be affected by differing N’s for the experimental 
and the control groups. But if the mean of the matching 
scores of the experimental group differs from that of the control 
group, we need an adjustment for that difference. In that 
case $\sigma^2_{x_3}$ cannot, for the purpose of substitution for $N\sigma^2_{\bar{y}'}$, be taken 
from the mean of its own array but, instead, from that mean 
plus $b(\bar{x}_1 - \bar{y}_1)$, where b is the regression coefficient. The 
variance of this further factor must be added to the variance 
due to the other factors (since it is independent of the others). 
This will give us, for our general case, where the subscripts i 
and f stand for the equating scores and the achievement scores, 
respectively, 

$$\sigma_{\bar{y} - \bar{y}'} = \sqrt{\frac{\sigma^2_{x_f}(1 - r^2_{if})}{N_x - 1} + \frac{\sigma^2_{y - y'}}{N_y - 1} + \frac{(\bar{x}_i - \bar{y}_i)^2\,\sigma^2_{x_f}(1 - r^2_{if})}{(N_x - 1)\,\sigma^2_{x_i}}} \tag{228}$$

(General formula for the standard error of the difference between 
means of predicted and obtained scores in an experimental 
comparison for one matching factor) 
TABLE XLII. Predicted Scores in Comparison with Attained Scores, 
Cooper’s Experiment 

Intelligence    Predicted    Attained    Difference
   score          score        score

     81          501.99        529        + 27.01
     79          493.37        410        - 83.37
     91          545.06        583        + 37.94
     96          566.59        380        -186.59
     90          540.75        580        + 39.25
    115          648.43        589        - 59.43
     73          467.53        539        + 71.47
    116          652.73        824        +171.27
     98          575.21        506        - 69.21
    105          605.36        832        +226.64
     95          562.28        595        + 32.72
     95          562.28        592        + 29.72
     99          579.51        286        -293.51
     92          549.36        397        -152.36
     96          566.59        350        -216.59
     87          527.83        582        + 54.17
     97          570.90        544        - 26.90
     64          428.77        345        - 83.77
     89          536.44        542        +  5.56
     90          540.75        592        + 51.25
     61          415.85        535        +119.15
     75          476.15        389        - 87.15
    119          665.65        857        +191.35
    111          631.20        571        - 60.20
     88          532.14        555        + 22.86
     95          562.29        511        - 51.29
     76          480.45        484        +  3.55
     67          441.69        408        - 33.69
     85          519.21        523        +  3.79
     89          536.44        387        -149.44
     89          536.44        381        -155.44
     89          536.44        356        -180.44
     63          424.46        362        - 62.46
     81          501.99        375        -126.99
     75          476.15        569        + 92.85
     77          484.76        394        - 90.76

Mean                 88.5      534.53      507.06     - 27.47
Standard deviation                         135         117.08
Applying this formula to Cooper’s data, we get 

$$\sigma_{\bar{y} - \bar{y}'} = \sqrt{\frac{(130.44)^2(1 - .492^2)}{36 - 1} + \frac{(117.08)^2}{36 - 1} + \frac{(1)^2(130.44)^2(1 - .492^2)}{(36 - 1)(14.9)^2}} = 27.6; \qquad t = \frac{27.47}{27.6} = 1.00$$


On page 455 the standard error computed by the $\sigma_{m_2 - m_3}$ 
method was found to be 27.7, a result in remarkably close 
agreement with the one found here. But the difference between 
means by that method was only 23 instead of our present 27.47. 
That discrepancy is due to the fact that the two groups were not 
perfectly equated; the athletes (experimental group) had a mean 
intelligence test score one point higher than the control group, 
and that one point difference in matching score carried the 
expectation of an extra achievement score of just about four 
points. Although for all practical purposes the two procedures 
tell the same story, the regression technique was really more 
accurate than the conventional matched-group technique because 
it showed what would be the expected difference if the groups 
were perfectly matched, as well as the reliability of that differ- 
ence. Of course, in practice we would not employ the regres- 
sion technique where we had the groups already matched, but 
we used it here on that type of problem in order that we might 
make comparisons between the methods. 
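
The application of the general formula to Cooper’s data may be verified with the sketch below. It assumes the three terms under the radical to be, in order, the residual variance of the control group’s achievement scores over (N - 1), the squared sigma of the attained-minus-predicted differences (117.08) over (N - 1), and the adjustment for the one-point gap between the matching means; the function and argument names are ours.

```python
import math


def se_predicted_vs_obtained(sigma_f, r_if, sigma_diff, n_control, n_experimental,
                             mean_gap, sigma_i):
    """Standard error of the mean difference between obtained and predicted scores."""
    residual_variance = sigma_f**2 * (1 - r_if**2)
    term_regression = residual_variance / (n_control - 1)
    term_differences = sigma_diff**2 / (n_experimental - 1)
    term_mean_gap = mean_gap**2 * residual_variance / ((n_control - 1) * sigma_i**2)
    return math.sqrt(term_regression + term_differences + term_mean_gap)


se = se_predicted_vs_obtained(sigma_f=130.44, r_if=0.492, sigma_diff=117.08,
                              n_control=36, n_experimental=36,
                              mean_gap=1.0, sigma_i=14.9)
ratio = 27.47 / se

# The same quantity with the third term dropped (matching means taken as equal).
se_without_gap = se_predicted_vs_obtained(130.44, 0.492, 117.08, 36, 36, 0.0, 14.9)

print(f"standard error = {se:.1f}, ratio = {ratio:.2f}")
print(f"standard error without the third term = {se_without_gap:.2f}")
```

With the groups this closely matched, the third term changes the standard error only in the second decimal place.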

For equating on a combination of several factors, $y'$ scores should 
be predicted by the partial regression equation (see Chap. VIII). 
Everything will behave in the same manner as with single matching 
except that the multiple correlation coefficient $R^2_{f(i_1, i_2, \ldots, i_k)}$ 
will replace $r^2_{if}$ in the first term under the standard-error radical 
and the third term under the radical will become 

$$\frac{\sigma^2_{x_f}\left(1 - R^2_{f(i_1, i_2, \ldots, i_k)}\right)}{N_x - 1}\;\Sigma\,\frac{(\bar{x}_i - \bar{y}_i)^2}{\sigma^2_{i_1 \cdot i_2 \ldots i_k}}$$

where $\bar{x}_i$ is the mean of any one of the matching factors in the 
control group and $\bar{y}_i$ is the mean of the corresponding matching 
array for the experimental group, the task calling for the summation 
of all the differences between such matching means squared 
and divided by the partial variance of its own $x_i$ array. 
For the number of matching variables indicated on the left, 
the partial sigma values required for the denominator are given 
at the right. Beyond four matching factors (if desirable to use 
more than four, which is doubtful), the worker should resort to 
the Doolittle method to compute multiple R for 

$$\sigma^2_{i_1 \cdot i_2 \ldots i_k} = \sigma^2_{i_1}\left[1 - R^2_{i_1(i_2 \ldots i_k)}\right]$$

For simplicity we shall write 1, 2, 3, etc., for $i_1, i_2, \ldots, i_k$. 

No. of 
Variables 

2    $\sigma^2_{1.2} = \sigma^2_1(1 - r^2_{12})$;  $\sigma^2_{2.1} = \sigma^2_2(1 - r^2_{12})$ 

3    $\sigma^2_{1.23} = \sigma^2_1(1 - r^2_{12})(1 - r^2_{13.2})$ 
     $\sigma^2_{2.13} = \sigma^2_2(1 - r^2_{23})(1 - r^2_{12.3})$ 
     $\sigma^2_{3.12} = \sigma^2_3(1 - r^2_{23})(1 - r^2_{13.2})$ 

4    $\sigma^2_{1.234} = \sigma^2_1(1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23})$ 
     $\sigma^2_{2.134} = \sigma^2_2(1 - r^2_{23})(1 - r^2_{12.3})(1 - r^2_{24.13})$ 
     $\sigma^2_{3.124} = \sigma^2_3(1 - r^2_{23})(1 - r^2_{13.2})(1 - r^2_{34.12})$ 
     $\sigma^2_{4.123} = \sigma^2_4(1 - r^2_{14})(1 - r^2_{24.1})(1 - r^2_{34.12})$ 

The formulas for the required partial r’s can be found on pages 
243 to 244 of this volume. 
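
For three matching variables the partial variance can be checked against the multiple-correlation form. The sketch below assumes the product form σ²(1 - r²₁₂)(1 - r²₁₃.₂) for the variance of factor 1 with factors 2 and 3 held constant, and compares it with σ²(1 - R²); the three r’s and the variance of 100 are hypothetical figures, and the function names are ours.

```python
import math


def partial_r(r_ab, r_ac, r_bc):
    """Correlation between a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


def partial_variance_3(var_1, r12, r13, r23):
    """Variance of factor 1 with factors 2 and 3 held constant (product form)."""
    r13_2 = partial_r(r13, r12, r23)
    return var_1 * (1 - r12**2) * (1 - r13_2**2)


def multiple_r_squared(r12, r13, r23):
    """Squared multiple correlation of factor 1 on factors 2 and 3."""
    return (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)


var_1, r12, r13, r23 = 100.0, 0.50, 0.40, 0.30   # hypothetical values
by_product = partial_variance_3(var_1, r12, r13, r23)
by_multiple_r = var_1 * (1 - multiple_r_squared(r12, r13, r23))

print(f"product form: {by_product:.4f}")
print(f"multiple-R form: {by_multiple_r:.4f}")
```

The two routes agree, which is why either the tabled products or a multiple R from the Doolittle method may serve for the denominator.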

This third term in the standard-error formula is the most 
tedious of all the parts of the work, and yet it makes a trivial 
difference in the value obtained if the groups are reasonably 
close together in the means of the matching elements. If the 
groups are not far from equal in equating scores and no great 
exactness is required, this third term of the standard-error 
formula might be dropped; the worker, then, should remember 
that the obtained standard error is a trifle smaller than the 
correct one. 

The above formulas are based on large sample theory. If, 
because the sample is small, the t is to be interpreted in terms of 
Student’s distribution, each (N - 1) in the denominator should 
be replaced by (N - k - 1), where k is the number of matching 
factors. Then use Table XII or XLV and XLVI instead of the 
table for the normal distribution, entering the table with 

$$(N_x + N_y - k - 2)$$

degrees of freedom. 
It will be observed that this technique involves matching 
each experimental subject with a hypothetical one in the control 
group who stands on the same point on the x axis, then comparing 
his final score with that of the mean of a hypothetical class 
of controls who had the same matching score as his. In the 
matched-group technique we equate the actual scores of paired 
subjects and compare the actual end scores of the same two. 

We applied the regression technique to Cooper’s experiment 
for the sake of comparison of methods. Ordinarily we would not 
apply it to that problem because the groups were already
satisfactorily matched by pairs. The regression technique is
particularly useful where the subjects cannot be paired without too much
sacrifice in size of population. The two groups need not be 
exactly equated for means on the matching factor, but to the 
extent to which the control group lies on a different level from the 
experimental group on the matching criterion, to that extent we 
make a hazardous assumption of rectilinearity of regression 
beyond the range of the control group scores. There is no need 
that control and experimental groups have the same number of 
subjects. Each group should be as large as available populations 
permit, although there would be loss rather than gain by includ- 
ing subjects who were clearly abnormal from the standpoint of 
any of the conditions of the experiment. 


INCREASED RELIABILITY FROM REPLICATION OF EXPERIMENTS 


We shall now return to a further consideration of the relation 
of the size of our sample to the reliability of our differences. We 
have had abundant occasion to see that the standard error of a 
difference varies inversely as the square root of the number of 
subjects. When true differences are small, as they often are, it 
is not possible to establish them reliably with such small groups 
of subjects as we may have at our command. Let us get this 
fact before us more vividly in terms of a formula for the number 
of subjects needed to give a standard error of a predetermined 
size when the true difference is believed to be a given amount. 
Let D stand for the true difference and t be the ratio of a difference
to its standard error upon which we intend to insist.


$$t = \frac{D}{\sigma_D} = \frac{D}{\sqrt{(\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2)/N}}$$


Let us assume that the two σ's are equal. Then, squaring and
solving for N,


$$N = \frac{2t^2\sigma^2(1 - r)}{D^2} \qquad (229)$$


From a series of experiments¹ the authors concluded that the
true difference to be expected from a certain kind of moral
instruction in school is about 0.4 of a standard deviation. Taking
this standard deviation to be substantially the same as the
one of our formula (the latter can be made the same by computing 
it from the two distributions combined instead of from one of 
them) and ignoring the element of correlation, the number of 
pairs of pupils indicated to yield a ratio of 3 would be 


$$N = \frac{2(3)^2}{(0.4)^2} = 112.5$$


This is the number required to give a ratio as high as 3 in half 
the trials. If we demand a ratio of 3 in as many as five-sixths 
of the trials, we must substitute 4 for t instead of 3, and our
required number is 200. 
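
The arithmetic of formula (229) can be checked with a few lines of code. The following is only a minimal sketch: the function name `pairs_needed` is ours and not part of the text, and it assumes, as above, that the two sigmas are equal and that D is expressed in sigma units.

```python
import math


def pairs_needed(t, d_in_sigmas, r=0.0):
    """Formula (229): N = 2 t^2 sigma^2 (1 - r) / D^2.

    d_in_sigmas is D / sigma, so sigma cancels out of the formula.
    r is the correlation between the paired scores (0 if ignored).
    """
    return 2.0 * t ** 2 * (1.0 - r) / d_in_sigmas ** 2


# The case worked in the text: D = 0.4 sigma, correlation ignored.
n_for_ratio_3 = pairs_needed(t=3, d_in_sigmas=0.4)  # about 112.5 pairs
n_for_ratio_4 = pairs_needed(t=4, d_in_sigmas=0.4)  # about 200 pairs

print(round(n_for_ratio_3, 1), math.ceil(n_for_ratio_3 - 1e-9))
print(round(n_for_ratio_4, 1))
```

Any positive correlation between the paired scores reduces the required number in proportion to (1 − r), which is one reason close matching pays.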

Since it is often impracticable to have as large experimental 
groups as those suggested above and since many situations 
involve differences that are important even though small, our 
dependence must frequently be placed upon the reliability of a 
set of replicated experiments. When the outcomes of a number 
of experiments point in the same direction, the reliability of the 
set is much greater than that of any one taken alone and much 
greater, too, than the average of the reliabilities of the several 
samples. We are, therefore, led to the consideration of the 
reliability of a set of experiments.

It is a fundamental principle of the mathematics of chance that 
if the probability of the occurrence of an event is p under one 
condition and q under another condition, it is pq for the two
conditions combined. Suppose, then, the probability is 7 in 
100 that a given difference would have been obtained if the true 
difference were zero or less in one experiment and 12 in 100


¹ See Journal of Educational Sociology, December, 1933, the whole of
which number is devoted to these experiments.

that a given other difference would have been obtained if the
true one were zero or below, the two experiments being entirely 
independent of each other. On the above mentioned principle 
the probability is only (.07) (.12) = .0084 that such differences 
would have been obtained both times in the two independent 
trials if the true difference were zero or below. This same 
principle would hold for any combination of additional probabilities
where always the same type of event occurred; i.e., where
always the difference turned out on the same side. Thus the 
odds are very great that the true difference lies on the indicated 
side when it continues in successive samples to be found con- 
sistently on that side, even though the odds indicated at any one 
trial are not high. 
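
The product rule just applied is easily sketched in code. This is an illustration only: the helper name `joint_probability` is ours, and the sketch assumes, as the text does, that the experiments are entirely independent and that every probability refers to a difference on the same side.

```python
import math


def joint_probability(single_trial_probs):
    """Multiply the single-trial probabilities of independent experiments."""
    return math.prod(single_trial_probs)


# The two experiments discussed above: 7 in 100 and 12 in 100.
p_both = joint_probability([0.07, 0.12])
print(round(p_both, 4))

# A hypothetical longer run of modest results, all on the same side.
p_run = joint_probability([0.20, 0.25, 0.15, 0.30])
print(round(p_run, 5))
```

Even four trials that are individually unimpressive yield a very small joint probability when their differences all fall on the same side.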

When the odds are low at the first few trials, it is extremely 
unlikely that the differences will continue to fall consistently 
on the same side. If they do so, that fact suggests unusual (but 
not impossible) deviation from the laws of chance or else some- 
thing erroneous about the calculation of the probabilities. When 
the odds are low in the individual samples, it is to be expected 
that some advantages will fall on one side of zero and some on the 
other. Even in such case, however, the reliability of a set of 
samples is greater than the average from the samples taken singly. 
In order to put this fact in better perspective, we shall develop 
some statistical formulas expressing the relations involved, not so 
much with the purpose of actually applying them to calculate 
joint probability but rather as a general basis for the interpreta- 
tion of the effect of replicating (repeating) an experiment. 

We assume an experimental factor running through all the 
trials equally potent to separate the contrasted groups in all 
trials, except for the effect of chance errors of sampling. Let 
$n_1, n_2, n_3, \ldots$ be the number of pairs of individuals in the
several experiments. Similarly let $D_1, D_2, D_3, \ldots$ be the
differences between means in samples, and let $t_1, t_2, t_3, \ldots$ be the
ratios of the several differences to their standard errors. Then


(C)    $$t_1 = \frac{D_1}{\sigma_{D_1}}$$


(D)    $$\sigma_{D_1} = \frac{\sigma_{d_1}}{\sqrt{n_1}}$$


where $\sigma_{d_1}$ is the standard deviation of the array of paired
differences in experiment 1. Furthermore, $D_1 = M_{d_1}$, whence


(E)    $$D_1 = \frac{\sum d_1}{n_1}$$

Thus, substituting (D) and (E) in (C),

(F)    $$t_1 = \frac{\sum d_1 / n_1}{\sigma_{d_1}/\sqrt{n_1}} = \frac{\sum d_1}{\sigma_{d_1}\sqrt{n_1}}$$


and, multiplying through by $\sqrt{n_1}$, $t_1\sqrt{n_1} = \sum d_1/\sigma_{d_1}$.

Thus we have the paired differences in “standard (z) scores” 
taken from zero as origin. If now we use the symbol z for these 
scores and sum for all the experiments, we have $\sum\sum z_d = \sum t_i\sqrt{n_i}$,
it being thus indicated that each t is to be weighted by the square
root of its corresponding n. If we assume that these deviations 
divided by the standard deviations of their respective series are 
sufficiently similar to permit averaging them without distortion 
of meaning, the difference between the means of the summed 
experimental and control groups will be the same as the mean 
of all the paired differences; viz., 


(G)    $$D_z = \frac{\sum\sum z_d}{\sum n_i} = \frac{\sum t_i\sqrt{n_i}}{\sum n_i}$$


The t for this set of combined z scores will be, by analogy with
(F),


$$t = \frac{\sum t_i\sqrt{n_i}}{\sum n_i}\cdot\frac{\sqrt{\sum n_i}}{\sigma_{z_d}} \qquad (230)$$


where now the $\sigma_{z_d}$ is the standard deviation of the whole
consolidated population of paired differences. If the $z_d$'s were taken
as deviations from the true mean instead of from zero and if the
z scores were standard scores from the total population instead
of from the subgroups, this $\sigma_{z_d}$ would be 1, which is always the
value of the standard deviation of a set of z scores. This latter
condition need not worry us, since the sigma of a standard-error
formula is properly that of the total population of samples
rather than that of a single sample. The former condition
operates only to add a constant to all scores, which does not
affect the variability. Thus the standard deviation is
substantially 1, and (230) simplifies to


$$t = \frac{\sum t_i\sqrt{n_i}}{\sqrt{\sum n_i}}$$


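This would lend itself to calculation if we had the several
ratios and the several populations. If we may assume
substantially equal populations in the several samples, our formula
further simplifies as follows:


(H)    $$t = \frac{\sqrt{n}\,\sum t_i}{\sqrt{kn}} = \frac{\sum t_i}{\sqrt{k}} = \sqrt{k}\,M_t \qquad (230a)$$

where $k$ is the number of samples, $n$ the common number of pairs in
each, and $M_t$ the mean of the several ratios.
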

Thus, if the samples are assumed to be equal in size of popula- 
tion and equally potent in contributing to a validly measured 
difference, the ratio between the total mean difference and its 
standard error may be taken to be substantially the square root 
of the number of samples times the mean ratio. Thus from 25 
determinations the standard-error ratio may be expected to be 
five times as great as the average ratio from a single determination. 
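
The weighting of each ratio by the square root of its number of pairs, and the square-root rule for equal samples, can be sketched as follows. The function names and the sample figures are hypothetical, chosen only for illustration, and the sketch takes the standard deviation of the pooled standard scores as 1, as argued above. It should be read as a rough guide and not as a recommended computing routine.

```python
import math


def combined_ratio(ratios, pair_counts):
    """Sum of t_i * sqrt(n_i), divided by the square root of the total n."""
    weighted = sum(t * math.sqrt(n) for t, n in zip(ratios, pair_counts))
    return weighted / math.sqrt(sum(pair_counts))


def combined_ratio_equal_samples(ratios):
    """Square root of the number of samples times the mean ratio."""
    k = len(ratios)
    return math.sqrt(k) * (sum(ratios) / k)


# Hypothetical case: 25 determinations of 40 pairs each, every ratio 1.2.
ratios = [1.2] * 25
pair_counts = [40] * 25

t_weighted = combined_ratio(ratios, pair_counts)
t_equal = combined_ratio_equal_samples(ratios)
print(round(t_weighted, 3), round(t_equal, 3))
```

With equal samples the two functions agree, and the combined ratio is five times the single ratio of 1.2, as the text states for 25 determinations.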

The formulas of this section, just as those of the following 
sections, are not offered as useful formulas for actually making
a quantitative computation of the correct statistic; the assump- 
tions which must be made are too precarious to make that safe, 
except as a rough estimate. The argument is intended merely 
to stress the fact that the reliability of a set of experiments with 
differences prevailingly in the same direction is much higher 
than that of the average single trial. But it must be noted that
this applies only where the populations of the several experiments 
are independent of one another; it does not apply with full force 
to the case where the same pupils are remeasured, but only 
where there are added chance samples of the total population 
regarding which the generalization is to be stated. 

One of the authors¹ has shown elsewhere that summing together
subtests, as when the experimenter gives a test at the end of 
each of a number of units in the course of the experiment and 
then sums the scores into totals in addition to making separate 


¹ Peters, C. C., “Increasing Reliability in Controlled Experiments,”
J. Educ. Psychol., Vol. 30, pp. 143-150.

showings, also increases the reliability of the experiment. Apart
from the question of fatigue, the same effect would be achieved 
by very extensive tests at the close of the experiment. If the 
paired differences in a matched-group experiment correlate zero 
among the subtests, which is rather likely to be the case, the fol- 
lowing formula indicates the manner in which t may be expected 
to respond to summing together $a$ comparable tests as compared
with the t from a single test:


$$t_s = t_1\sqrt{a} \qquad (231)$$


where $t_s$ is the ratio of the difference to its standard error for
the summed scores, $t_1$ is that of a single test, and $a$ is the number
of tests. This indicates tremendous gains in reliability from 
extensive testing. 


VALID MEASUREMENTS 


The previous section was directed against the practice of using 
short and meager tests for measuring outcomes of experimenta- 
tion and small populations. In this section we shall show the 
influence of validity of measurement in separating the means. 
Unfortunately it often happens in experimentation that com- 
mercial tests are employed to measure outcomes because they 
have prestige or because they are readily available, when they 
are not very closely related to the difference the experimental 
factor could be expected to make. That is, they may be valid 
for other purposes but not particularly valid for measuring the 
outcomes of the particular experiment in question. We shall 
show that the difference between measured outcomes in experi- 
mental and control groups is likely to be attenuated by reason 
of this lack of validity in the testing instrument. 

Let c stand for the elements which a test measures that are 
affected by the experimental factor, let b stand for other identifi- 
able elements validly measured as some kind of performance 
but not a performance affected by the experimental factor, and 
let e be chance factors caught in the measures. The test is, then, 
valid for its purpose to the extent to which it measures only the 
c factors. Each individual’s score will be made up of c + b + e
factors. The difference between the means will be little affected
by the b or the e factors, since they will tend to average about 
the same on the experimental and the control sides. But the 
variability will be affected by the b and the e factors as well as
by the c factors; it will be increased by reason of them. 

If the test were perfectly valid, the difference, measured in
standard scores, would be $D/\sigma_c$, while with an invalid test it is
$D/\sigma_{(c+b+e)}$. This latter is $\sigma_c/\sigma_{(c+b+e)}$ times as great as the former.
But $\sigma_c/\sigma_{(c+b+e)} = r_{c(c+b+e)}$, which is the validity coefficient of the
test, i.e., the correlation between its scores and perfectly valid scores
of the same function. The proof of that is as follows: Consider
our measures of c, b, and e to be in the form of deviations from
the means of their respective arrays. Then


$$r_{c(c+b+e)} = \frac{\sum c^2 + \sum cb + \sum ce}{N\sigma_c\,\sigma_{(c+b+e)}}$$


But, since b and e are uncorrelated with c, $\sum cb$ and $\sum ce$ equal
zero. Taking the N with the $\sum c^2$ of the numerator, we have


$$r_{c(c+b+e)} = \frac{\sigma_c^2}{\sigma_c\,\sigma_{(c+b+e)}} = \frac{\sigma_c}{\sigma_{(c+b+e)}}$$


If $D_1$ represents the difference when validly measured and D
the difference obtained by the somewhat invalid test, we have,
from the last sentence preceding the proof, $D = rD_1$, or $D_1 = D/r$.
Thus the difference would be separated if validly measured to an 
amount equal to the obtained difference divided by the validity 
coefficient of the test for measuring the particular function dif- 
ferentiating between the experimental and the control processes, 
when we are talking in terms of standard scores. This will 
operate to raise the standard-error ratio since the σ of the
denominator, which has been decreased by elimination of the 
irrelevant elements from the test, is the one which appears in 
the standard-error formula. 
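
A small numerical sketch may make the attenuation concrete. The function names and the variances below are hypothetical. Beyond the assumption made in the proof (b and e uncorrelated with c), the sketch also assumes that b and e are uncorrelated with each other, so that the variance of the whole score is simply the sum of the three variances.

```python
import math


def validity_coefficient(var_c, var_b, var_e):
    """sigma_c / sigma_(c+b+e) for mutually uncorrelated c, b, and e."""
    return math.sqrt(var_c) / math.sqrt(var_c + var_b + var_e)


def validly_measured_difference(obtained_difference, r):
    """Obtained difference (in standard scores) divided by the validity coefficient."""
    return obtained_difference / r


# Hypothetical test: relevant variance 1, irrelevant 2, chance 1.
r = validity_coefficient(var_c=1.0, var_b=2.0, var_e=1.0)
corrected = validly_measured_difference(0.2, r)
print(round(r, 3), round(corrected, 3))
```

Here only a quarter of the score variance is relevant, the validity coefficient is one-half, and an obtained difference of 0.2 sigma corresponds to 0.4 sigma when validly measured.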

The validity coefficient we are talking about here has nothing 
to do with the validity coefficients often published by test 
makers, which are correlations with scores from other trusted 
tests measuring presumably the same function. We mean by 
the validity coefficient the coefficient of correlation between scores 
containing no factors other than relevant ones and corresponding 
scores containing some such factors—an r that could have, 
under ordinary testing conditions, only a theoretical meaning but 
about which it is, nevertheless, illuminating to speculate. 
Of course, in practice we could not ordinarily make this correction
quantitatively, because we do not know the validity coefficient
of the test for the purpose in hand. But this argument
shows the danger involved in employing testing instruments in 
experimentation which have little pertinency to the experimental 
factor and do little justice to it, then taking seriously the small 
and unreliable differences thus obtained. This very often hap- 
pens in practice. It is not improbable that tests are sometimes 
employed for measuring outcomes in experimentation which 
are padded out by 90 per cent of irrelevant elements while 
failing to include another 90 per cent of the outcomes actually 
influenced by the experimental factor. Under these conditions 
the real difference would be ten times as great (in standard terms) 
as the obtained one. 


A SIGNIFICANT RATIO 


Finally, we must again protest the magic that is involved, 
chiefly for laymen in statistics, in a ratio of just 3 between a 
difference and its standard error. This is completely arbi- 
trary. Several other equally arbitrary ratios have been suggested. 
Fisher proposes 2, while McCall has obtained wide use of 2.78. 
All these ratios, except perhaps Fisher’s, are higher than are 
usually attainable in experiments in education and the social 
sciences. If one looks through the experimental literature in 
these fields, he will find that by these standards the vast majority 
of experiments turn out to show differences that are “not 
statistically significant.” That would be harmless enough 
if such outcome were not so frequently misinterpreted. It is 
often taken to mean that the two procedures are equal in value 
while the experiment may indicate odds of 10 to 1, or 100 to 1, 
or even 500 to 1 that one is superior to the other. Under such 
circumstances the evidence does not mean that the procedures 
are probably of equal effectiveness but only that it has not yet 
been conclusively proved that A is better than B. We should like 
to bet on the stock market with the odds 100 to 1 in our favor, or 
even 5 to 1; and in the same spirit we are willing to consider with 
more favor than we accord its rival a procedure that an experi- 
ment indicates to be superior by odds of much less than the 
740 to 1 that a ratio of 3 indicates, while we await more conclusive
evidence.
In order to make this objection somewhat more tangible, we
might tentatively suggest another particular ratio—if we could 
feel guaranteed that it, too, would not be taken as magic. If 
the reader will examine the shape of a normal curve, he will find 
a place where the curve bends to a maximum degree in thinning 
out the distribution into a long tail. This point, as one can 
easily determine by placing the third derivative of the normal-curve
function equal to zero and solving for z, is $\sqrt{3}$ sigmas from
the mean; that is, 1.73σ. We suggest that this might be taken
as a standard for provisional acceptance of the findings of an 
experiment, and we propose that it might be named the working 
ratio. This point of maximum deflection is the place where the 
tail begins most radically to thin and hence where the odds 
begin to increase extremely rapidly with added z distances. Of 
course, that does not really make it any less arbitrary as a 
standard; but that ratio represents odds of 23 to 1, and those 
would seem to be high enough to gamble on while we seek more 
conclusive evidence. 
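
The odds quoted in this section can be checked from the normal curve directly. The sketch below is ours, not part of the original computation: it assumes the odds are taken from the one-tailed area beyond the ratio, an assumption which reproduces the figures of about 23 to 1 and about 740 to 1. It also checks that the polynomial factor of the third derivative of the normal function, $3z - z^3$, vanishes at $z = \sqrt{3}$.

```python
import math


def upper_tail_area(z):
    """Area of the unit normal curve lying beyond z (one tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def odds_to_one(z):
    """Odds, expressed as 'x to 1', against a chance ratio as large as z."""
    p = upper_tail_area(z)
    return (1.0 - p) / p


def third_derivative_factor(z):
    """The factor (3z - z^3) of the third derivative of exp(-z^2 / 2)."""
    return 3.0 * z - z ** 3


working_ratio = math.sqrt(3.0)
print(round(working_ratio, 3))
print(round(odds_to_one(working_ratio), 1))  # odds at the working ratio
print(round(odds_to_one(3.0), 0))            # odds at a ratio of 3
print(abs(third_derivative_factor(working_ratio)) < 1e-9)
```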


Exercises 


1. Work up Beagle’s experiment (page 458) in a number of ways. Of 
course, the population here is too small to justify elaborate statistical 
manipulation, but the data are presented because the scores are so small as 
to involve a minimum of penciling, and they will do well enough for practice. 

a. Find the difference between the means of gains for the two groups and 
the reliability of that difference. 

b. These groups were matched on general intelligence test scores. Al- 
though these scores are not given here, the pairs are ranked in descending 
order on the basis of them. Compute ρ between this general intelligence
criterion and initial scores in appreciation of literature. On the basis of this
finding criticize the use of general intelligence test scores as a matching
basis in this experiment. 

c. Compare the groups on the basis of merely end scores. Does this
comparison tell the same story as that told by gains? Which tells the safer
story?

d. In which group do final scores correlate more highly with initial scores?
What is the importance of that, if any?

e. How do gains correlate with initial scores? With the matching element?
Compare with formulas (223) and (224), page 461.

f. What correlation is there between gains in the experimental and in the
control group? Compare with page 459.

g. Try the difference method of correlation on Exercise e [formula (44),
page 101], and compare its findings and its convenience with the usual
product formula.

h. Compare the variability of gains in the experimental group with that
in the control group, and interpret results. 

i. The Abbott-Trabue test has rather low reliability for this age level. 
What effect has that on the findings? You have from these exercises 
information on the reliability of this measure for these groups. Find it. 

2. Examine the write-ups of experiments in doctors’ and masters’ theses, 
and in journals, and compare their techniques with the recommendations 
of this chapter. 

3. If you wish to do Exercise 2 more systematically, make a score card for
experimental technique, and with it systematically evaluate the techniques 
employed in a wide sampling of experiments in education, psychology, and 
the social sciences. 
TABLE XLIII.—NORMAL PROBABILITY INTEGRAL ORIENTED IN TERMS OF q.
(Continued)


  q     x/σ      z    |    q     x/σ      z    |    q     x/σ      z

.350   0.3853   .3704  |  .300   0.5244   .3477  |  .250   0.6745   .3178
.349   0.3880   .3700  |  .299   0.5273   .3472  |  .249   0.6776   .3171
.348   0.3907   .3696  |  .298   0.5302   .3466  |  .248   0.6808   .3164
.347   0.3934   .3692  |  .297   0.5330   .3461  |  .247   0.6840   .3157
.346   0.3961   .3688  |  .296   0.5359   .3456  |  .246   0.6871   .3151
.345   0.3989   .3684  |  .295   0.5388   .3450  |  .245   0.6903   .3144
.344   0.4016   .3680  |  .294   0.5417   .3445  |  .244   0.6935   .3137
.343   0.4043   .3676  |  .293   0.5446   .3440  |  .243   0.6967   .3130
.342   0.4070   .3672  |  .292   0.5476   .3434  |  .242   0.6999   .3123
.341   0.4097   .3668  |  .291   0.5505   .3429  |  .241   0.7031   .3116
.340   0.4125   .3664  |  .290   0.5534   .3423  |  .240   0.7063   .3109
.339   0.4152   .3660  |  .289   0.5563   .3417  |  .239   0.7095   .3102
.338   0.4179   .3656  |  .288   0.5592   .3412  |  .238   0.7128   .3095
.337   0.4207   .3652  |  .287   0.5622   .3406  |  .237   0.7160   .3087
.336   0.4234   .3647  |  .286   0.5651   .3401  |  .236   0.7192   .3080
.335   0.4261   .3643  |  .285   0.5681   .3395  |  .235   0.7225   .3073
.334   0.4289   .3639  |  .284   0.5710   .3389  |  .234   0.7257   .3066
.333   0.4316   .3635  |  .283   0.5740   .3384  |  .233   0.7290   .3058
.332   0.4344   .3630  |  .282   0.5769   .3378  |  .232   0.7323   .3051
.331   0.4372   .3626  |  .281   0.5799   .3372  |  .231   0.7356   .3044
.330   0.4399   .3621  |  .280   0.5828   .3366  |  .230   0.7388   .3036
.329   0.4427   .3617  |  .279   0.5858   .3360  |  .229   0.7421   .3029
.328   0.4454   .3613  |  .278   0.5888   .3355  |  .228   0.7454   .3022
.327   0.4482   .3608  |  .277   0.5918   .3349  |  .227   0.7488   .3014
.326   0.4510   .3604  |  .276   0.5948   .3343  |  .226   0.7521   .3007
.325   0.4538   .3599  |  .275   0.5978   .3337  |  .225   0.7554   .2999
.324   0.4565   .3595  |  .274   0.6008   .3331  |  .224   0.7588   .2992
.323   0.4593   .3590  |  .273   0.6038   .3325  |  .223   0.7621   .2984
.322   0.4621   .3585  |  .272   0.6068   .3319  |  .222   0.7655   .2976
.321   0.4649   .3581  |  .271   0.6098   .3313  |  .221   0.7688   .2969
.320   0.4677   .3576  |  .270   0.6128   .3306  |  .220   0.7722   .2961
.319   0.4705   .3571  |  .269   0.6158   .3300  |  .219   0.7756   .2953
.318   0.4733   .3567  |  .268   0.6189   .3294  |  .218   0.7790   .2945
.317   0.4761   .3562  |  .267   0.6219   .3288  |  .217   0.7824   .2938
.316   0.4789   .3557  |  .266   0.6250   .3282  |  .216   0.7858   .2930
.315   0.4817   .3552  |  .265   0.6280   .3275  |  .215   0.7892   .2922
.314   0.4845   .3548  |  .264   0.6311   .3269  |  .214   0.7926   .2914
.313   0.4874   .3543  |  .263   0.6341   .3263  |  .213   0.7961   .2906
.312   0.4902   .3538  |  .262   0.6372   .3256  |  .212   0.7995   .2898
.311   0.4930   .3533  |  .261   0.6403   .3250  |  .211   0.8030   .2890
.310   0.4959   .3528  |  .260   0.6433   .3244  |  .210   0.8064   .2882
.309   0.4987   .3523  |  .259   0.6464   .3237  |  .209   0.8099   .2874
.308   0.5015   .3518  |  .258   0.6495   .3231  |  .208   0.8134   .2866
.307   0.5044   .3513  |  .257   0.6526   .3224  |  .207   0.8169   .2858
.306   0.5072   .3508  |  .256   0.6557   .3218  |  .206   0.8204   .2849
.305   0.5101   .3503  |  .255   0.6588   .3211  |  .205   0.8239   .2841
.304   0.5129   .3498  |  .254   0.6620   .3204  |  .204   0.8274   .2833
.303   0.5158   .3493  |  .253   0.6651   .3198  |  .203   0.8310   .2825
.302   0.5187   .3487  |  .252   0.6682   .3191  |  .202   0.8345   .2816
.301   0.5215   .3482  |  .251   0.6713   .3184  |  .201   0.8381   .2808
TABLE XLIII.—NORMAL PROBABILITY INTEGRAL ORIENTED IN TERMS OF q.*
(Concluded)


  q     x/σ     |    q     x/σ      z

.050   1.6449   |  .016   2.1444   .0400
.049   1.6546   |  .015   2.1701   .0379
.048   1.6646   |  .014   2.1973   .0357
.047   1.6747   |  .013   2.2262   .0335
.046   1.6849   |  .012   2.2571   .0312
.045   1.6954   |  .011   2.2904   .0290
.044   1.7060   |  .010   2.3263   .0267
.043   1.7169   |  .009   2.3656   .0243
.042   1.7279   |  .008   2.4089   .0219
.041   1.7392   |  .007   2.4573   .0195
.040   1.7507   |  .006   2.5121   .0170
.039   1.7624   |  .005   2.5758   .0145
.038   1.7744   |  .004   2.6521   .0118
.037   1.7866   |  .003   2.7478   .0091
.036   1.7991   |  .002   2.8782   .0063
.035   1.8119   |  .001   3.0902
.034   1.8250   |


* The above table was adapted from the table by Kondo and Elderton, published in 
Biometrika, Vol. 22, pp. 368-376. The following table (Table XLIV) was adapted from 
Pearson’s “Tables for Statisticians and Biometricians.” Both tables are used by arrange- 
ments with the publishers through Prof. E. S. Pearson, editor of Biometrika. 
TABLE XLIV.—NORMAL PROBABILITY INTEGRAL, ORIENTED IN TERMS OF x/σ


x/σ    .50-q     z     |  x/σ    .50-q     z     |  x/σ    .50-q     z

0.00           .3989  |  0.50   .1915   .3521  |  1.00   .3413   .2420
0.01           .3989  |  0.51   .1950   .3503  |  1.01   .3438   .2396
0.02           .3989  |  0.52   .1985   .3485  |  1.02   .3461   .2371
0.03           .3988  |  0.53   .2019   .3467  |  1.03   .3485   .2347
0.04           .3986  |  0.54   .2054   .3448  |  1.04   .3508   .2323
0.05           .3984  |  0.55   .2088   .3429  |  1.05   .3531   .2299
0.06           .3982  |  0.56   .2123   .3410  |  1.06   .3554   .2275
0.07           .3980  |  0.57   .2157   .3391  |  1.07   .3577   .2251
0.08           .3977  |  0.58   .2190   .3372  |  1.08   .3599   .2227
0.09           .3973  |  0.59   .2224   .3352  |  1.09   .3621   .2203
0.10           .3970  |  0.60   .2257   .3332  |  1.10   .3643   .2179
0.11           .3965  |  0.61   .2291   .3312  |  1.11   .3665   .2155
0.12           .3961  |  0.62   .2324   .3292  |  1.12   .3686   .2131
0.13           .3956  |  0.63   .2357   .3271  |  1.13   .3708   .2107
0.14           .3951  |  0.64   .2389   .3251  |  1.14   .3729   .2083
0.15           .3945  |  0.65   .2422   .3230  |  1.15   .3749   .2059
0.16           .3939  |  0.66   .2454   .3209  |  1.16   .3770   .2036
0.17           .3932  |  0.67   .2486   .3187  |  1.17   .3790   .2012
0.18           .3925  |  0.68   .2517   .3166  |  1.18   .3810   .1989
0.19           .3918  |  0.69   .2549   .3144  |  1.19   .3830   .1965
0.20           .3910  |  0.70   .2580   .3123  |  1.20   .3849   .1942
0.21           .3902  |  0.71   .2611   .3101  |  1.21   .3869   .1919
0.22           .3894  |  0.72   .2642   .3079  |  1.22   .3888   .1895
0.23           .3885  |  0.73   .2673   .3056  |  1.23   .3907   .1872
0.24           .3876  |  0.74   .2704   .3034  |  1.24   .3925   .1849
0.25           .3867  |  0.75   .2734   .3011  |  1.25   .3944   .1826
0.26           .3857  |  0.76   .2764   .2989  |  1.26   .3962   .1804
0.27           .3847  |  0.77   .2794   .2966  |  1.27   .3980   .1781
0.28           .3836  |  0.78   .2823   .2943  |  1.28   .3997   .1758
0.29           .3825  |  0.79   .2852   .2920  |  1.29   .4015   .1736
                      |  0.80           .2897  |  1.30   .4032   .1714
                      |                        |  1.31   .4049   .1691
                      |                        |  1.32   .4066   .1669
                      |                        |  1.33   .4082   .1647
                      |                        |  1.34   .4099   .1626
                      |                        |  1.35   .4115   .1604
                      |                        |  1.36   .4131   .1582
                      |                        |  1.37   .4147   .1561
                      |                        |  1.38   .4162   .1539
                      |                        |  1.39   .4177   .1518
                      |                        |  1.40   .4192   .1497
                      |                        |  1.41   .4207   .1476
                      |                        |  1.42   .4222   .1456
                      |                        |  1.43   .4236   .1435
                      |                        |  1.44   .4251   .1415
                      |                        |  1.45   .4265   .1394
                      |                        |  1.46   .4279   .1374
                      |                        |  1.47   .4292   .1354
                      |                        |  1.48   .4306   .1334
                      |                        |  1.49           .1315
TABLE XLIV.—NORMAL PROBABILITY INTEGRAL, ORIENTED IN TERMS OF
x/σ.—(Continued)


x/σ    .50-q     z     |  x/σ    .50-q     z     |  x/σ    .50-q     z

1.50   .4332   .1295  |  2.00                  |  2.50   .4938   .0175
1.51   .4345   .1276  |  2.01                  |  2.51   .4940   .0171
1.52   .4357   .1257  |  2.02                  |  2.52   .4941   .0167
1.53   .4370   .1238  |  2.03                  |  2.53   .4943   .0163
1.54   .4382   .1219  |  2.04                  |  2.54   .4945   .0158
1.55   .4394   .1200  |  2.05                  |  2.55   .4946   .0154
1.56   .4406   .1182  |  2.06                  |  2.56   .4948   .0151
1.57   .4418   .1163  |  2.07                  |  2.57   .4949   .0147
1.58   .4429   .1145  |  2.08                  |  2.58   .4951   .0143
1.59   .4441   .1127  |  2.09                  |  2.59   .4952   .0139
1.60   .4452   .1109  |  2.10                  |  2.60   .4953   .0136
1.61   .4463   .1092  |  2.11                  |  2.61   .4955   .0132
1.62   .4474   .1074  |  2.12                  |  2.62   .4956   .0129
1.63   .4484   .1057  |  2.13                  |  2.63   .4957   .0126
1.64   .4495   .1040  |  2.14                  |  2.64   .4959   .0122
1.65   .4505   .1023  |  2.15                  |  2.65   .4960   .0119
1.66   .4515   .1006  |  2.16                  |  2.66   .4961   .0116
1.67   .4525   .0989  |  2.17                  |  2.67   .4962   .0113
1.68   .4535   .0973  |  2.18                  |  2.68   .4963   .0110
1.69   .4545   .0957  |  2.19                  |  2.69   .4964   .0107
1.70   .4554   .0940  |  2.20                  |  2.70   .4965   .0104
1.71   .4564   .0925  |  2.21                  |  2.71   .4966   .0101
1.72   .4573   .0909  |  2.22                  |  2.72   .4967   .0099
1.73   .4582   .0893  |  2.23                  |  2.73   .4968   .0096
1.74   .4591   .0878  |  2.24                  |  2.74   .4969   .0093
1.75   .4599   .0863  |  2.25                  |  2.75   .4970   .0091
1.76   .4608   .0848  |  2.26                  |  2.76   .4971   .0088
1.77   .4616   .0833  |  2.27                  |  2.77   .4972   .0086
1.78   .4625   .0818  |  2.28                  |  2.78   .4973   .0084
1.79   .4633   .0804  |  2.29                  |  2.79   .4974   .0081
1.80   .4641   .0790  |  2.30                  |  2.80   .4974   .0079
1.81   .4649   .0775  |  2.31                  |  2.81   .4975   .0077
1.82   .4656   .0761  |  2.32                  |  2.82   .4976   .0075
1.83   .4664   .0748  |  2.33                  |  2.83   .4977   .0073
1.84   .4671   .0734  |  2.34                  |  2.84   .4977   .0071
1.85   .4678   .0721  |  2.35                  |  2.85   .4978   .0069
1.86   .4686   .0707  |  2.36                  |  2.86   .4979   .0067
1.87   .4693   .0694  |  2.37                  |  2.87   .4979   .0065
1.88   .4699   .0681  |  2.38                  |  2.88   .4980   .0063
1.89   .4706   .0669  |  2.39                  |  2.89   .4981   .0061
1.90   .4713   .0656  |  2.40                  |  2.90   .4981   .0060
1.91   .4719   .0644  |  2.41                  |  2.91   .4982   .0058
1.92   .4726   .0632  |  2.42                  |  2.92   .4982   .0056
1.93   .4732   .0620  |  2.43                  |  2.93   .4983   .0055
1.94   .4738   .0608  |  2.44                  |  2.94   .4984   .0053
1.95   .4744   .0596  |  2.45                  |  2.95   .4984   .0051
1.96   .4750   .0584  |  2.46                  |  2.96   .4985   .0050
1.97   .4756   .0573  |  2.47                  |  2.97   .4985   .0048
1.98   .4761   .0562  |  2.48                  |  2.98   .4986   .0047
1.99   .4767   .0551  |  2.49                  |  2.99   .4986   .0046
TABLE XLIV.—NORMAL PROBABILITY INTEGRAL, ORIENTED IN TERMS OF
x/σ.*—(Concluded)


(Prefix in each column the digits indicated in parentheses.)


x/σ    .50-q    z     |  x/σ    .50-q    z     |  x/σ    .50-q    z
       (.49)   (.00)  |                        |                 (.000)

3.00   8650           |  3.50                  |  4.00   9683   1338
3.01   8694           |  3.51                  |  4.01   9696   1286
3.02   8736           |  3.52                  |  4.02   9709   1235
3.03   8777           |  3.53                  |  4.03   9721   1186
3.04   8817           |  3.54                  |  4.04   9733   1140
3.05   8856           |  3.55                  |  4.05   9744   1094
3.06   8893           |  3.56                  |  4.06   9755   1051
3.07   8930           |  3.57                  |  4.07   9765   1009
3.08   8965           |  3.58                  |  4.08   9775   0969
3.09   8999           |  3.59                  |  4.09   9784   0930
3.10   9032           |  3.60                  |
3.11   9065           |  3.61                  |
3.12   9096           |  3.62                  |
3.13   9126           |  3.63                  |
3.14   9155           |  3.64                  |
3.15   9184           |  3.65                  |
3.16   9211           |  3.66                  |
3.17   9238           |  3.67                  |
3.18   9264           |  3.68                  |
3.19   9289           |  3.69                  |
3.20   9313           |  3.70                  |
3.21   9336           |  3.71                  |
3.22   9359           |  3.72                  |
3.23   9381           |  3.73                  |
3.24   9402           |  3.74                  |
3.25   9423           |  3.75                  |
3.26   9443           |  3.76                  |
3.27   9462           |  3.77                  |
3.28   9481           |  3.78                  |
3.29   9499           |  3.79                  |
3.30   9517           |  3.80                  |
3.31   9534           |  3.81                  |
3.32   9550           |  3.82                  |
3.33   9566           |  3.83                  |
3.34   9581           |  3.84                  |
3.35   9596           |  3.85                  |
3.36   9610           |  3.86                  |
3.37   9624           |  3.87                  |
3.38   9638           |  3.88                  |
3.39   9651           |  3.89                  |
3.40   9663           |  3.90                  |
3.41   9675           |  3.91                  |
3.42   9687           |  3.92                  |
3.43   9698           |  3.93                  |
3.44   9709           |  3.94                  |
3.45   9720           |  3.95                  |
3.46   9730           |  3.96                  |
3.47   9740           |  3.97                  |
3.48   9749           |  3.98                  |
3.49   9759           |  3.99                  |

* See footnote on p. 484. 
TABLE XLV.—THE DISTRIBUTION OF STUDENT’S t*
(Percentage of the Total Area in Tail of Distribution from t = x/σ to ∞)


* This table, and Table XLVI, were taken from Student's article, “New Tables for Testing
the Significance of Observations,” Metron, Vol. V, pp. 114-120. Student, whose real
name was William Sealy Gosset, died in 1937. We were granted permission by his wife and
heir, Mrs. Marjory Gosset, to use these tables. We have made a few corrections in this
table which Student reported to Prof. Egon S. Pearson of the University of London and
which Prof. Pearson passed on to us. In this table we subtracted the entries made by
Student from 1.00 so as to give the percentage in the tail of the distribution between t = x/σ
and ∞. Table XLVI we used as it stands in Metron.
TABLE XLVI.—STUDENT’S TABLE FOR CORRECTING PROBABILITIES*

* For acknowledgment, see footnote to Table XLV. Use with entries in Table XLV:
g grt | sor’ | tee" | tze | 998° | Yes" | Fs" | ess" | Oss | Bes" | BIS” 008° | 98% | 698° | Be" | Bes" | OBI” 
a zoe’ | oez: | 222° | coz’ | ez" | re" | Gea" | seo" | +z’ | ZT" | 802° | 261° | SBT" | TZT: est* | Tt" | 960° 6z 
R lay: | ert’ | oof | ses* | gas" | sos* | ges" | sts° | ess” | Tes” | Tes" | 60s” g6g` | 448° | £98" | Gee" | 981° 
ore: | 962° | 98a" | Ize" | tos" | rez’ | see" | SEs" | Tes" | za" Iz’ | soc" | 161° | 941° | 6ST” | SET" | 660° 8 
os OF oe ve 0G 91 #1 6L u or 6 8 4 9 g + e z t IN: 
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APPENDIX 


610° 
Zo" 
sto" 
ogo" 
980° 
890° 
err 
20° 
Ter 
880° 
69r 
601 
eert 
oer” 
918: 
OFT” 
48g 
esT" 
gtg: 
ot" 
698" 
621° 
dhë 


910° 
oto’ 
980° 
¥20" 
140° 
970° 
860° 
190° 
ore 
310° 
ver" 
880° 
tor 
601 
yer’ 
bra a 
ger 
TET 
608° 
OFT” 
986° 
est” 
are” 


09 


sg 
TABLE XLVIII.—VALUES OF P FOR THE CHI-SQUARE TEST OF GOODNESS 
OF FIT 

   |  n = 2 |  n = 3 |  n = 4 |  n = 5 |  n = 6 |  n = 7 |  n = 8 |  n = 9 | n = 10 
χ² | n' = 3 | n' = 4 | n' = 5 | n' = 6 | n' = 7 | n' = 8 | n' = 9 | n' = 10| n' = 11 

 1 | .606531| .801253| .909796| .962566|        | .994829| .998249| .999438| .999828 
 2 | .367879| .572407| .735759| .849146|        | .959840| .981012| .991468| .996340 
 3 | .223130| .391625| .557825| .699986|        | .885002| .934357| .964295| .981424 
 4 | .135335| .261464| .406006| .549416| .676676| .779778| .857123| .911413| .947347 
 5 | .082085| .171797| .287298| .415880| .543813| .659963| .757576| .834308| .891178 

 6 | .049787| .111610| .199148| .306219| .423190| .539750| .647232| .739919| .815263 
 7 | .030197| .071897| .135888| .220640| .320847| .428880| .536632| .637119| .725444 
 8 | .018316| .046012| .091578| .156236| .238103| .332594| .433470| .534146| .628837 
 9 | .011109| .029291| .061099| .109064| .173578| .252656| .342296| .437274| .532104 
10 | .006738| .018566| .040428| .075235| .124652| .188573| .265026| .350485| .440493 

11 | .004087| .011726| .026564| .051380| .088376| .138619| .201699| .275709| .357518 
12 | .002479| .007383| .017351| .034787| .061969| .100558| .151204| .213308| .285057 
13 | .001503| .004637| .011276| .023379| .043036| .072109| .111850| .162607| .223672 
14 | .000912| .002905| .007295| .015609| .029636| .051181| .081765| .122325| .172992 
15 | .000553| .001817| .004701| .010363| .020256| .036000| .059145| .090937| .132061 

16 | .000335| .001134| .003019| .006844| .013754| .025116| .042380| .066881| .099632 
17 | .000203| .000707| .001933| .004500| .009283| .017396| .030109| .048716| .074364 
18 | .000123| .000440| .001234| .002947| .006232| .011970| .021226| .035174| .054964 
19 | .000075| .000273| .000786| .001922| .004164| .008187| .014860| .025193| .040263 
20 | .000045| .000170| .000499| .001250| .002769| .005570| .010336| .017913| .029253 

21 | .000028| .000105| .000317| .000810| .001835| .003770| .007147| .012650| .021093 
22 | .000017| .000065| .000200| .000524| .001211| .002541| .004916| .008880| .015105 
23 | .000010| .000040| .000127| .000338| .000796| .001705| .003364| .006197| .010747 
24 | .000006| .000025| .000080| .000217| .000522| .001139| .002292| .004301| .007600 
25 | .000004| .000016| .000050| .000139| .000341| .000759| .001554| .002971| .005345 

26 | .000002| .000010|        | .000090| .000223| .000504| .001050| .002043| .003740 
27 | .000001| .000006|        | .000057| .000145| .000333| .000707| .001399| .002604 
28 | .000001| .000004|        | .000037| .000094| .000220| .000474| .000954| 
29 | .000001| .000002|        | .000023| .000061| .000145| .000317| .000648| 
30 | .000000| .000001|        | .000015| .000039| .000095| .000211| .000439| 

40 | .000000| .000000| .000000| .000000| .000001| .000001| .000003| .000008| 
50 | .000000| .000000| .000000| .000000| .000000| .000000| .000000| .000000| 
60 | .000000| .000000| .000000| .000000| .000000| .000000| .000000| .000000| 
70 | .000000| .000000| .000000| .000000| .000000| .000000| .000000| .000000| 

Note: This is the Elderton table, taken from Pearson's Tables for Statisticians 
and Biometricians by arrangement with the publishers. 
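
Entries of this table can be checked by direct computation. The sketch below is not Elderton's procedure; it assumes that P is the area of the chi-square curve beyond the observed χ² with n = n' - 1 degrees of freedom, and builds that area by a standard recurrence, starting from the exponential form when n is even and from the normal-curve tail when n is odd. The function name is introduced only for this illustration.

```python
import math

def chi_square_p(chi2, n_prime):
    """P for a given chi square and n' (assumes n = n' - 1 degrees of freedom)."""
    df = n_prime - 1
    half = chi2 / 2.0
    if df % 2 == 0:
        p, d = math.exp(-half), 2               # exact tail for 2 degrees of freedom
    else:
        p, d = math.erfc(math.sqrt(half)), 1    # exact tail for 1 degree of freedom
    while d < df:
        # raising the degrees of freedom by 2 adds one term to the tail area
        p += half ** (d / 2) * math.exp(-half) / math.gamma(d / 2 + 1)
        d += 2
    return p

print(round(chi_square_p(1, 3), 6))    # compare the entry for chi square = 1, n' = 3
print(round(chi_square_p(10, 11), 6))  # compare the entry for chi square = 10, n' = 11
```

Small disagreements in the sixth decimal place between such a computation and the printed entries are to be expected from rounding.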


TABLE XLVIII.—VALUES OF P FOR THE CHI-SQUARE TEST OF GOODNESS 
OF FIT.—(Continued) 

   | n = 11 | n = 12 | n = 13 | n = 14 | n = 15 | n = 16 | n = 17 | n = 18 | n = 19 
χ² | n' = 12| n' = 13| n' = 14| n' = 15| n' = 16| n' = 17| n' = 18| n' = 19| n' = 20 

 1 | .999950| .999986| .999997| .999999| 1.     | 1.     | 1.     | 1.     | 1. 
 2 | .998496| .999406| .999774| .999917| .999970| .999990| .999997| .999999| 1. 
 3 | .990726| .995544| .997934| .999074| .999598| .999830| .999931| .999972| .999989 
 4 | .969917| .983436| .991191| .995466| .997737| .998903| .999483| .999763| .999894 
 5 | .931167| .957979| .975193| .985813| .992127| .995754| .997771| .998860| .999431 

 6 | .873365| .916082| .946153| .966491| .979749| .988095| .993187| .996197| .997929 
 7 | .799073| .857613| .902151| .934711| .957650| .973260| .983549| .990125| .994213 
 8 | .713340| .785131| .843601| .889327| .923783| .948867| .966547| .978637| .986671 
 9 | .621892| .702931| .772943| .831051| .877517| .913414| .940261| .959743| .973479 
10 | .530387| .615960| .693934| .762183| .819739| .866628| .903610| .931906| .952946 

11 | .443263| .528919| .610817| .686036| .752594| .809485| .856564| .894357| .923839 
12 | .362642| .445680| .527643| .606303| .679028| .743980| .800136| .847327| .885624 
13 | .293326| .369041| .447812| .526524| .602298| .672758| .736186| .791573| .838571 
14 | .232093| .300708| .373844| .449711| .525529| .598714| .667102| .729091| .783691 
15 | .182498| .241436| .307354| .378154| .451418| .524638| .595482| .661967| .722598 

16 | .141130| .191236| .249129| .313374| .382051| .452961| .523834| .592547| .657277 
17 | .107876| .149597| .199304| .256178| .318864| .385597| .454366| .523105| .589868 
18 | .081581| .115691| .157520| .206781| .262666| .323897| .388841| .455653| .522438 
19 | .061094| .088529| .123104| .164949| .213734| .268663| .328532| .391823| .456836 
20 | .045341| .067086| .095210| .130141| .171932| .220220| .274229| .332819| .394578 

21 | .033371| .050380| .072929| .101632| .136830| .178510| .226291| .279413| .336801 
22 | .024374| .037520| .055362| .078614| .107804| .143191| .184719| .231985| .284256 
23 | .017676| .027726| .041677| .060270| .084140| .113735| .149251| .190590| .237342 
24 | .012733| .020341| .031130| .045822| .065093| .089504| .119435| .155028| .196152 
25 | .009117| .014822| .023084| .034566| .049943| .069824| .094710| .124915| .160542 

26 | .006490| .010734| .017001| .025887| .038023| .054028| .074461| .099758| .130189 
27 | .004595| .007727| .012441| .019254| .028736| .041483| .058068| .078995| .104653 
28 | .003238| .005532| .009050| .014228| .021569| .031620| .044938| .062055| .083428 
29 | .002270| .003940| .006546| .010450| .016085| .023936| .034526| .048379| .065985 
30 | .001585| .002792| .004710| .007632| .011921| .018002| .026345| .037446| .051798 

40 | .000036| .000072| .000138| .000255| .000453| .000778| .001294| .002087| .003272 
50 | .000001| .000001| .000003| .000006| .000012| .000023| .000042| .000075| .000131 
60 | .000000| .000000| .000000| .000000| .000000| .000001| .000001| .000002| .000004 
70 | .000000| .000000| .000000| .000000| .000000| .000000| .000000| .000000| .000000 



TABLE XLVIII.—VALUES OF P FOR THE CHI-SQUARE TEST OF GOODNESS 
OF FIT.—(Concluded) 

χ² | n' = 21| n' = 22| n' = 23| n' = 24| n' = 25| n' = 26| n' = 27| n' = 28| n' = 29| n' = 30 

 1 | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1. 
 2 | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1. 
 3 | .999996| .999998| .999999| 1.     | 1.     | 1.     | 1.     | 1.     | 1.     | 1. 
 4 | .999954| .999980| .999992| .999997| .999999| 1.     | 1.     | 1.     | 1.     | 1. 
 5 | .999722| .999868| .999939| .999972| .999987| .999994| .999998| .999999|        | 

 6 | .998808| .999427| .999708| .999855| .999929| .999966| .999984| .999993| .999997| .999999 
 7 | .996685| .998142| .998980| .999452| .999711| .999851| .999924| .999962| .999981| .999991 
 8 | .991868| .995143| .997160| .998371| .999085| .999494| .999726| .999853| .999924| .999960 
 9 | .982907| .989214| .993331| .995957| .997595| .998596| .999194| .999546| .999748| .999863 
10 | .968171| .978912| .986304| .991277| .994547| .996653| .997981| .998803| .999302| .999599 

11 | .946223| .962787| .974749| .983189| .989012| .992946| .995549| .997239| .998315| .998988 
12 | .916076| .939617| .957379| .970470| .979908| .986567| .991173| .994294| .996372| .997728 
13 | .877384| .908624| .933161| .951990| .966121| .976501| .983974| .989247| .992900| .995384 
14 | .830496| .869599| .901479| .926871| .946650| .961732| .973000| .981254| .987189| .991377 
15 | .776408| .822952| .862238| .894634| .920759| .941383| .957334| .969432| .978436| .985015 

16 | .716624| .769650| .815886| .855268| .888076| .914828| .936203| .952947| .965819| .975536 
17 | .652074| .711106| .763362| .809251| .848662| .881793| .909083| .931122| .948589| .962181 
18 | .587408| .649004| .705988| .757489| .803008| .842390| .875773| .903519| .926149| .944272 
19 | .521826| .585140| .645328| .701224| .751990| .797120| .836430| .870001| .898136| .921288 
20 | .457930| .521261| .583040| .641912| .696776| .746825| .791556| .830756| .864464| .892027 

21 | .397132| .458944| .520738| .581087| .638725| .692609| .741964| .786288| .825349| .859149 
22 | .340511| .399510| .459889| .520252| .579267| .635744| .688697| .737377| .781291| .820189 
23 | .288795| .343979| .401730| .460771| .519798| .577564| .632947| .685013| .733041| .776543 
24 | .242892| .293058| .347229| .403808| .461597| .519373| .575965| .630316| .681535| .728932 
25 | .201431| .247164| .297075| .350285| .405760| .462373| .518975| .574462| .627835| .678248 

26 | .165812| .206449| .251682| .300866| .353165| .407598| .463105| .518600| .573045| .625491 
27 | .135264| .170853| .211226| .255967| .304453| .355884| .409333| .463794| .518247| .571705 
28 |        |        | .175681| .215781| .260040| .307853| .358458| .410973| .464447| .517913 
29 | .087759| .114002| .144861| .180310| .220131| .263916| .311082| .360899| .412528| .465066 
30 | .069854| .091988| .118464| .149402| .184752| .224289| .267611| .314154| .363218| .414004 

40 | .004995| .007437| .010812| .015369| .021387| .029164| .039012| .051237| .066128| .083937 
50 | .000221| .000365| .000586| .000921| .001416| .002131| .003144| .004551| .006467| .009032 
60 | .000007| .000013| .000022| .000038| .000064| .000104| .000168| .000264| .000407| .000618 
70 | .000000| .000000| .000001| .000001| .000002| .000004| .000007| .000011| .000019| .000030 



TABLE XLIX.—COEFFICIENTS OF r's IN THE TETRACHORIC CORRELATION 
SERIES 

Coefficient of r's according to percentage in tail of distribution 

rⁿ  |    5%  |    6%  |    7%  |    8%  |    9%  |   10%  |   11%  |   12%  |   13% 

r²  | 1.1631 | 1.0994 | 1.0436 |  .9936 |  .9481 |  .9062 |  .8673 |  .8309 |  .7965 
r³  |  .6963 |  .5786 |  .4809 |  .3977 |  .3256 |  .2622 |  .2059 |  .1554 |  .1097 
r⁴  | -.0989 | -.1849 | -.2476 | -.2942 | -.3291 | -.3552 | -.3745 | -.3884 | -.3981 
r⁵  | -.5398 | -.5167 | -.4860 | -.4517 | -.4157 | -.3795 | -.3435 | -.3083 | -.2741 
r⁶  | -.2903 | -.1929 | -.1120 | -.0442 |  .0127 |  .0608 |  .1014 |  .1357 |  .1646 
r⁷  |  .2360 |  .2853 |  .3125 |  .3250 |  .3272 |  .3222 |  .3121 |  .2982 |  .2816 
r⁸  |  .3640 |  .3115 |  .2529 |  .1969 |  .1449 |  .0972 |  .0540 |  .0150 | -.0199 
r⁹  |  .0082 | -.0739 | -.1334 | -.1759 | -.2052 | -.2243 | -.2354 | -.2401 | -.2398 
r¹⁰ | -.3077 | -.2995 | -.2755 | -.2442 | -.2092 | -.1729 | -.1368 | -.1019 | -.0686 
r¹¹ | -.1597 | -.0768 | -.0081 |  .0048 |  .0914 |  .1256 |  .1513 |  .1699 |  .1824 
r¹² |  .1921 |  .2258 |  .2363 |  .2318 |  .2175 |  .1970 |  .1727 |  .1463 |  .1191 
r¹³ |  .2282 |  .1650 |  .1039 |  .0485 |  .0002 | -.0407 | -.0745 | -.1020 | -.1234 
r¹⁴ | -.0705 | -.1323 | -.1693 | -.1880 | -.1933 | -.1891 | -.1784 | -.1623 | -.1431 
r¹⁵ | -.2347 | -.2011 | -.1577 | -.1117 | -.0672 | -.0261 |  .0105 |  .0423 |  .0691 
r¹⁶ | -.0328 |  .0414 |  .0948 |  .1306 |  .1522 |  .1626 |  .1641 |  .1590 |  .1488 
r¹⁷ |  .2004 |  .1985 |  .1773 |  .1461 |  .1106 |  .0743 |  .0392 |  .0068 | -.0222 
r¹⁸ |  .1077 |  .0349 | -.0250 | -.0711 | -.1043 | -.1265 | -.1388 | -.1435 | -.1420 
r¹⁹ | -.1436 | -.1701 | -.1715 | -.1572 | -.1337 | -.1054 | -.0751 | -.0450 | -.0162 
r²⁰ | -.1522 | -.0913 | -.0335 |  .0162 |  .0562 |  .0864 |  .1075 |  .1207 |  .1270 
r²¹ |  .0785 |  .1267 |  .1482 |  .1507 |  .1404 |  .1219 |  .0984 |  .0726 |  .0463 
.0778) —.0471| —.0743| —.0941| —.1071 
—.1144 —.1263; — 0684 
073 .0110) 0845 
0758) 1215 .0832 


Note: This table was made by H. P. Peters. 
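
The entries of this table can be reproduced, to the fourth decimal place in the cases tried, by a simple rule. This rule is an inference from spot checks of a few entries (r², r³, r⁴, and r⁵ at several percentages), not a statement taken from the table itself: the entry for rⁿ appears to be the Hermite polynomial of degree n - 1, evaluated at the normal deviate h that cuts off the stated tail, divided by the square root of n factorial. The function name and arguments below are introduced only for this sketch.

```python
import math
from statistics import NormalDist

def tetrachoric_coefficient(power, tail):
    """Assumed form of the tabled entry for r**power (power >= 2).

    tail is the proportion in the tail of the distribution, e.g. 0.05 for 5%.
    """
    h = NormalDist().inv_cdf(1.0 - tail)    # normal deviate cutting off the tail
    prev, cur = 1.0, h                      # Hermite polynomials of degree 0 and 1
    for k in range(1, power - 1):
        # recurrence: He_(k+1)(h) = h * He_k(h) - k * He_(k-1)(h)
        prev, cur = cur, h * cur - k * prev
    return cur / math.sqrt(math.factorial(power))

print(round(tetrachoric_coefficient(2, 0.05), 4))  # compare the r² entry under 5%
print(round(tetrachoric_coefficient(5, 0.38), 4))  # compare the r⁵ entry under 38%
```

A check of this kind is also a convenient way to detect a misprinted or misread digit in a single entry.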


TABLE XLIX.—COEFFICIENTS OF r's IN THE TETRACHORIC CORRELATION 
SERIES.—(Continued) 

Coefficient of r's according to percentage in tail of distribution 

rⁿ  |   14%  |   15%  |   16%  |   17%  |   18%  |   19%  |   20%  |   21%  |   22% 

r²  |  .7639 |  .7329 |  .7032 |  .6747 |  .6473 |  .6208 |  .5951 |  .5703 |  .5461 
r³  |  .0682 |  .0303 | -.0045 | -.0366 | -.0662 | -.0936 | -.1191 | -.1428 | -.1648 
r⁴  | -.4042 | -.4074 | -.4082 | -.4070 | -.4040 | -.3995 | -.3937 | -.3868 | -.3789 
r⁵  | -.2424 | -.2092 | -.1785 | -.1491 | -.1210 | -.0941 | -.0683 | -.0437 | -.0203 
r⁶  |  .1883 |  .2090 |  .2256 |  .2391 |  .2498 |  .2580 |  .2640 |  .2681 |  .2703 

r⁷  |  .2659 |  .2433 |  .2088 |  .2013 |  .1798 |  .1582 |  .1367 |  .1154 |  .0945 
r⁸  | -.0494 | -.0785 | -.1027 | -.1238 | -.1421 | -.1578 | -.1710 | -.1820 | -.1909 
r⁹  | -.2398 | -.2278 | -.2176 | -.2054 | -.1917 | -.1767 | -.1607 | -.1442 | -.1271 
r¹⁰ | -.0403 | -.0085 |  .0181 |  .0425 |  .0644 |  .0840 |  .1014 |  .1167 |  .1300 
r¹¹ |  .1956 |  .1928 |  .1921 |  .1884 |  .1822 |  .1738 |  .1636 |  .1520 |  .1393 

r¹² |  .0961 |  .0651 |  .0394 |  .0150 | -.0079 | -.0291 | -.0485 | -.0662 | -.0882 
r¹³ | -.1470 | -.1511 | -.1571 | -.1620 | -.1625 | -.1602 | -.1554 | -.1487 | -.1402 
r¹⁴ | -.1279 | -.0997 | -.0771 | -.0547 | -.0328 | -.0118 |  .0082 |  .0268 |  .0440 
r¹⁵ |  .1001 |  .1089 |  .1223 |  .1319 |  .1380 |  .1410 |  .1412 |  .1390 |  .1346 
r¹⁶ |  .1426 |  .1183 |  .1001 |  .0809 |  .0612 |  .0415 |  .0223 |  .0038 | -.0138 

r¹⁷ | -.0581 | -.0693 | -.0871 | -.1013 | -.1120 | -.1194 | -.1239 | -.1257 | -.1208 
r¹⁸ | -.1447 | -.1252 | -.1120 | -.0967 | -.0801 | -.0627 | -.0450 | -.0274 | -.0101 
r¹⁹ |  .0223 |  .0339 |  .0545 |  .0719 |  .0861 |  .0972 |  .1052 |  .1105 |  .1131 
r²⁰ |  .1395 |  .1234 |  .1156 |  .1047 |  .0916 |  .0770 |  .0614 |  .0452 |  .0289 
r²¹ |  .0067 | -.0035 | -.0255 | -.0449 | -.0615 | -.0729 | -.0863 | -.0945 | -.1000 

r²² | -.1282 | -.1156 | -.1135 | -.1116 | -.0971 | -.0857 | -.0726 | -.0583 | -.0434 
r²³ | -.0291 | -.0217 |  .0004 |  .0207 |  .0389 |  .0546 |  .0678 |  .0784 |  .0864 
r²⁴ |  .1187 |  .1037 |  .1058 |  .1038 |  .0983 |  .0901 |  .0796 |  .0675 |  .0542 
r²⁵ |  .0450 |  .0419 |  .0207 |  .0004 | -.0185 | -.0355 | -.0503 | -.0627 | -.0728 
TABLE XLIX.—COEFFICIENTS OF r's IN THE TETRACHORIC CORRELATION 
SERIES.—(Continued) 

Coefficient of r's according to percentage in tail of distribution 

rⁿ  |   23%  |   24%  |   25%  |   26%  |   27%  |   28%  |   29%  |   30%  |   31% 

r²  |  .5225 |  .4995 |  .4770 |  .4549 |  .4333 |  .4122 |  .3913 |  .3708 |  .3506 
r³  | -.1854 | -.2046 | -.2225 | -.2393 | -.2549 | -.2696 | -.2832 | -.2960 | -.3079 
r⁴  | -.3701 | -.3606 | -.3504 | -.3396 | -.3283 | -.3165 | -.3043 | -.2917 | -.2788 
r⁵  |  .0022 |  .0233 |  .0436 |  .0628 |  .0810 |  .0983 |  .1147 |  .1301 |  .1447 
r⁶  |  .2709 |  .2701 |  .2679 |  .2645 |  .2600 |  .2545 |  .2481 |  .2409 |  .2329 

r⁷  |  .0737 |  .0545 |  .0347 |  .0159 | -.0023 | -.0198 | -.0366 | -.0527 | -.0680 
r⁸  | -.1979 | -.2030 | -.2065 | -.2085 | -.2090 | -.2082 | -.2061 | -.2029 | -.1986 
r⁹  | -.1093 | -.0873 | -.0750 | -.0578 | -.0408 | -.0241 | -.0078 |  .0080 |  .0233 
r¹⁰ |  .1412 |  .1506 |  .1581 |  .1641 |  .1683 |  .1711 |  .1724 |  .1724 |  .1711 
r¹¹ |  .1249 |  .1114 |  .0965 |  .0814 |  .0661 |  .0508 |  .0355 |  .0204 |  .0056 

r¹² | -.0961 | -.1083 | -.1189 | -.1277 | -.1348 | -.1404 | -.1444 | -.1470 | -.1482 
r¹³ | -.1294 | -.1193 | -.1073 | -.0945 | -.0811 | -.0674 | -.0534 | -.0394 | -.0253 
r¹⁴ |  .0597 |  .0739 |  .0864 |  .0973 |  .1066 |  .1147 |  .1206 |  .1253 |  .1288 
r¹⁵ |  .1281 |  .1205 |  .1113 |  .0962 |  .0854 |  .0740 |  .0621 |  .0498 |  .0373 
r¹⁶ | -.0303 | -.0455 | -.0593 | -.0717 | -.0826 | -.0920 | -.0999 | -.1063 | -.1112 

r¹⁷ | -.1208 | -.1174 | -.1109 | -.1030 | -.0938 | -.0836 | -.0727 | -.0611 | -.0490 
r¹⁸ |  .0064 |  .0220 |  .0366 |  .0500 |  .0620 |  .0727 |  .0819 |  .0897 |  .0960 
r¹⁹ |  .1134 |  .1115 |  .1076 |  .1020 |  .0950 |  .0866 |  .0772 |  .0669 |  .0560 
r²⁰ |  .0128 | -.0027 | -.0176 | -.0315 | -.0443 | -.0558 | -.0660 | -.0750 | -.0824 
r²¹ | -.1031 | -.1038 | -.1026 | -.0990 | -.0940 | -.0874 | -.0796 | -.0706 | -.0608 

r²² | -.0281 | -.0131 |  .0016 |  .0157 |  .0389 |  .0411 |  .0521 |  .0619 |  .0703 
r²³ |  .0919 |  .0945 |  .0958 |  .0945 |  .0914 |  .0866 |  .0803 |  .0727 |  .0640 
r²⁴ |  .0402 |  .0286 |  .0117 | -.0023 | -.0156 | -.0333 | -.0397 | -.0502 | -.0593 
r²⁵ | -.0803 | -.0855 | -.0884 | -.0891 | -.0878 | -.0846 | -.0798 | -.0735 | -.0660 
TABLE XLIX.—COEFFICIENTS OF r's IN THE TETRACHORIC CORRELATION 
SERIES.—(Concluded) 

Coefficient of r's according to percentage in tail of distribution 

rⁿ  |   32%  |   33%  |   34%  |   35%  |   36%  |   37%  |   38%  |   39%  |   40% 

r²  |  .3307 |  .3111 |  .2917 |  .2725 |  .2535 |  .2347 |  .2160 |  .1975 |  .1792 
r³  | -.3189 | -.3292 | -.3388 | -.3476 | -.3558 | -.3633 | -.3702 | -.3764 | -.3820 
r⁴  | -.2655 | -.2520 | -.2383 | -.2243 | -.2101 | -.1958 | -.1813 | -.1666 | -.1518 
r⁵  |  .1584 |  .1713 |  .1833 |  .1946 |  .2050 |  .2146 |  .2235 |  .2317 |  .2391 
r⁶  |  .2242 |  .2148 |  .2049 |  .1944 |  .1834 |  .1720 |  .1602 |  .1481 |  .1356 

r⁷  | -.0826 | -.0964 | -.1095 | -.1218 | -.1333 | -.1440 | -.1540 | -.1631 | -.1715 
r⁸  | -.1934 | -.1872 | -.1802 | -.1725 | -.1640 | -.1548 | -.1451 | -.1348 | -.1241 
r⁹  |  .0380 |  .0521 |  .0655 |  .0783 |  .0904 |  .1017 |  .1122 |  .1220 |  .1310 
r¹⁰ |  .1687 |  .1651 |  .1605 |  .1550 |  .1485 |  .1413 |  .1332 |  .1245 |  .1151 
r¹¹ | -.0088 | -.0228 | -.0363 | -.0492 | -.0615 | -.0731 | -.0840 | -.0942 | -.1036 

r¹² | -.1480 | -.1466 | -.1440 | -.1404 | -.1356 | -.1299 | -.1234 | -.1159 | -.1078 
r¹³ | -.0114 |  .0022 |  .0155 |  .0283 |  .0407 |  .0524 |  .0636 |  .0740 |  .0837 
r¹⁴ |  .1308 |  .1310 |  .1302 |  .1278 |  .1245 |  .1202 |  .1153 |  .1087 |  .1015 
r¹⁵ |  .0248 |  .0123 | -.0001 | -.0121 | -.0238 | -.0350 | -.0457 | -.0558 | -.0652 
r¹⁶ | -.1147 | -.1167 | -.1173 | -.1167 | -.1148 | -.1117 | -.1075 | -.1023 | -.0961 

r¹⁷ | -.0366 | -.0242 | -.0117 |  .0007 |  .0127 |  .0244 |  .0357 |  .0463 |  .0563 
r¹⁸ |  .1008 |  .1042 |  .1062 |  .1068 |  .1061 |  .1041 |  .1009 |  .0966 |  .0913 
r¹⁹ |  .0445 |  .0327 |  .0208 |  .0088 | -.0030 | -.0145 | -.0257 | -.0364 | -.0465 
r²⁰ | -.0884 | -.0930 | -.0961 | -.0978 | -.0982 | -.0972 | -.0949 | -.0915 | -.0869 
r²¹ | -.0503 | -.0398 | -.0279 | -.0164 | -.0049 |  .0064 |  .0175 |  .0282 |  .0383 

r²² |  .0773 |        |  .0870 |  .0897 |  .0910 |  .0909 |  .0895 |  .0867 |  .0829 
r²³ |  .0545 |        |  .0336 |  .0225 |  .0114 |  .0003 | -.0106 | -.0212 | -.0314 
7*4 | — 0672 —.0786) —.0822 —.0844| — 0825| —.0793 
—.0574 —.0380) —.0275 —.0048} .0153) .0254 
TABLE L.—[...] r's CORRESPONDING TO TETRACHORIC r's IN WIDESPREAD CLASSES.—(Continued) 
Corresponding r according to percentage in tail of distribution 

[Body of table not legible in this copy.] 
TABLE LI.—THE PREDICTED LOCATION OF AN INDIVIDUAL IN A DEPENDENT 
MEASUREMENT FROM HIS STANDING IN AN INDEPENDENT ONE 
r = .05 


Tenth         Tenth dependent variable 
independent     9     8     7     6     5     4     3     2     1 


I             960   902   833   754   664   565   453   328   184 
II            944   872   791   702   605   502   392   274   145 
III           933   852   763   669   569   465   357   244   125 
IV            923   834   739   641   540   436   329   221   110 
V             912   816   717   616   513   409   305   201    98 
VI            902   799   695   591   487   384   283   184    88 
VII           890   779   671   564   460   359   261   166    77 
VIII          875   756   643   535   431   331   237   148    67 
IX            855   726   608   498   395   298   209   128    56 


Note: Tenths in the independent factor are indicated by the Roman 
numerals at the left, while tenths in the dependent one are at the tops of 
columns. Example for reading the table: Skeletal development and intelligence 
test scores are correlated to the extent of +.05. A boy stands in the 
third tenth in skeletal development (theoretically at the mid-point of that 
tenth). The chances are 106 in 1,000 that he will be found in the highest 
tenth in intelligence (no lower than its lower border); they are 209 in 1,000 
that he will be in the second tenth or higher; 312 in 1,000 that he will be in 
the third tenth or better; etc. 


This table was made by Richard P. T. Scott. 
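
The example in the note can be verified by computation. The sketch below is not a description of how the table was made; it assumes a normal correlation surface, places the individual at the mid-point of his tenth in the independent measurement, and finds the chance that his dependent score falls at or above the lower border of a given tenth. Tenths are counted from the top (1 is the highest tenth), and the function name and arguments are introduced only for this illustration.

```python
from statistics import NormalDist

STD = NormalDist()  # unit normal curve

def chances_in_1000(r, tenth_independent, tenth_dependent):
    """Chances in 1,000 of standing in tenth_dependent or higher.

    tenth_independent: 1 to 10, counted from the top.
    tenth_dependent:   1 to 9, counted from the top.
    """
    # standing at the mid-point of the tenth in the independent measurement
    x = STD.inv_cdf(1.0 - (tenth_independent - 0.5) / 10.0)
    # lower border of the tenth in the dependent measurement
    border = STD.inv_cdf(1.0 - tenth_dependent / 10.0)
    # scatter of the dependent scores about the regression line
    spread = (1.0 - r * r) ** 0.5
    return 1000.0 * (1.0 - STD.cdf((border - r * x) / spread))

# The boy in the third tenth, with r = +.05:
for tenth in (1, 2, 3):
    print(tenth, round(chances_in_1000(0.05, 3, tenth)))
```

Under these assumptions the three printed values agree with the 106, 209, and 312 quoted in the note.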


TABLE LI.—THE PREDICTED LOCATION OF AN INDIVIDUAL IN A DEPENDENT 
MEASUREMENT FROM HIS STANDING IN AN INDEPENDENT ONE.—(Continued) 


r = .35 

Tenth         Tenth dependent variable 
independent     9     8     7     6     5     4     3     2     1 

I                   935   880   812   731   635   522   388   226 
II                  901   828   745   651   546   431   305   164 
III           947   875   792   699   599   493   379   259   132 
                          759   661   557   450   339   225   110 
TABLE LI.—THE PREDICTED LOCATION OF AN INDIVIDUAL IN A DEPENDENT 
MEASUREMENT FROM HIS STANDING IN AN INDEPENDENT ONE.—(Concluded) 
r = .65 


Tenth         Tenth dependent variable 
independent     9     8     7     6     5     4     3     2     1 


INDEX 


A 


Alienation, coefficient of, 115 
Allport, F. H., 82 
Amos, C. E., 380 
Arithmetic mean, 41 
Attenuation, correcting for, 203 
Average deviation, 63-67, 81 
from the mean, 64-66 
from the median, 66-67 
Average intercorrelation, 196-201 
from ranks, 200-201 
Averages, correlation between, 193- 
196 


B 


Beagle, B. M., 457 
Beta coefficients, 223, 237 
Biserial correlation, 362-366 
formulas for, 364, 385 
standard error of, 365 
from widespread classes, 384-391 
standard error of, 389-391 
Blakeman test, 318 
Bond, Eva, 352 
Burgess, E. W., 392, 402, 415 
Burt, C. L., 252, 438 


C 


Central tendency, 40-62 

Centroid method, 252 

Chi square, 319, 404-423 
distribution of, 407-408, 498-500 
nature of, 404-412 
probability equation for, 409 
relation to F and z, 419-420 
use in contingency tables, 414-417 
use in curve fitting, 417-418 

Communality, 255 


Contingency correlation, 391-393 
Copper, J. A., 453, 460, 467 
Correcting coefficients of correlation, 
for attenuation, 203 
for bias, 152 
for broad categories, 393-399 
for heterogeneity, 208-212 
for overlapping, 212-217 
Correlation, aids for computing, 
501-507 
assumptions in, 109 
bias in, 153 
biserial, 362-366 
from widespread classes, 384- 
391 
chart for, 100 
correcting for attenuation, 203 
correcting for broad categories, 
393-399 
correcting for heterogeneity, 208- 
212 
correcting for overlapping, 212- 
217 
formula derivation, 94-96 
between gains, 460-463 
intraclass, 201-202 
limits of, 117-118 
mean-square contingency, 391- 
393 
between means, 160-162 
nature of, 91-96 
as overlapping, 118-123 
partial and multiple, 220-225 
predicted for lengthened tests, 
193-196 
product moment, 91-110 
Spearman ranks, 103-109 
spurious, 217 
between squared measures, 181 
between standard deviations, 182 
standard error of, 152-155 


between sums of 
samples, 193, 214, 217 
sums and differences formulas, 
101-103 
tetrachoric, 366-375 
from widespread classes, 375- 
384 
translating to 2’, 155 
from unequal intervals, 399-402 
between variates and means or 
index values, 396-399 
Correlation chart, 100 
Correlation ratio, 312-330 
correcting for broad categories, 
323 
formulas for, 312, 316 
partial, 326-327 
relation to analysis of variance, 
353-357, 421-422 
unbiased, 319-326 
Correlation surface, 368, 410 
Cottrell, L. S., 392, 402, 415 
Courtis, S. A., 436, 440 
Covariance, 351, 463 
Craig, C. C., 443 
Crayton, S. G., 124 
Curve, Gompertz, 426, 435-441 
growth and decay, 425, 431-435 
normal, 279-311 
normal ogive, 76, 426, 435 
parabola, 425, 429-431 
Pearsonian system, 441-444 
straight line, 1-2, 427-429 


D 


D, standard error of, 151 
symbol for range first to ninth 
deciles, 75 
Degrees of freedom, 349-351, 354- 
355, 412-414, 417-418 
Derivative, definition of, 3 
of the function xⁿ, 7 
of a function of a function, 18 
of an inverse function, 18 
of a logarithm, 20-22 
of the normal curve function, 25- 
28 
of power forms, 22-24 
of a product, 16 
of a quotient, 17 
of a sine, 28-30 
of a sum of functions, 8 
Determination, coefficient of, 117, 
332 
Differentiation, fundamental steps, 5 
meaning of, 3-6 
partial, 30-31 
successive, 13-14, 15, 27-28 
Distribution, of chi square, 407-408, 
498-500 
of epsilon square, 494-497 
Huntington’s approach to, 422- 
423 
normal, 481-487 
of Student’s t, 171-176, 421, 488- 
493 
of variance estimates, 349 
of z and F, 420 
Doolittle, M. H., 226 
Doolittle method, 226-238 
Dunlap, J. W., 158 


E 


e (mathematical constant), 19-20 
Eaton, M. T., 359 
Elderton, W. P., 419, 484, 498 